Omg it is SO REFRESHING to see language designers acknowledging that REPLs exist and that yes, indentation-sensitivity FAILS HARD IN A FIRE in a REPL

(from the Magritte language thesis, files.jneen.net/academic/thesis.pdf )

My concept of T-expressions roughly fills the same role that 'skeleton trees' (or rather, the informal use of vectors that implements them) seem to fill in Magritte: a concrete syntax tree.

It's only slightly more expressive than a raw vector, and only in the sense that it adds the equivalent of 'cdr' to a vector and formalises the informal pattern of 'the first cell describes the data type'.

I think this extra information is important, but that's the hardest idea to sell.
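
To make it concrete, here's a minimal sketch of the shape I have in mind, in Python (the names TExpr/head/items/rest are mine, not from the Magritte thesis):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class TExpr:
    head: Any                        # first cell: names/describes the data type
    items: tuple                     # the vector payload
    rest: Optional["TExpr"] = None   # the 'cdr'-equivalent: an optional shared tail

    def __iter__(self):
        node = self
        while node is not None:
            yield from node.items
            node = node.rest

# A plain type-tagged vector is just a TExpr with no tail:
point = TExpr("point", (3, 4))

# ...and the cdr-equivalent lets two expressions share a tail, cons-style:
shared = TExpr("args", (2, 3))
call_a = TExpr("call", ("f", 1), rest=shared)
call_b = TExpr("call", ("g", 1), rest=shared)

print(list(call_a))   # ['f', 1, 2, 3]
```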

I think Magritte is on the wrong track with using % for lexical variables though. I'd stick with the familiar $ and use % only for dynamic variables, which should be a much smaller set.

@natecull
Very interesting thread, thanks for posting! A couple of thoughts:

> Javascript seems to be the current hothouse for language design… It's fun to see this Cambrian explosion of language concepts

I really wonder if #racket will get popular enough to trigger an even bigger Cambrian explosion. It's certainly what they're going for

> maybe arrays beat (ie, are a more universal storage abstraction mechanism than) conses.

Any thoughts on #apl? Not minimal syntax, but similar ideas

@codesections @natecull On the arrays point, my understanding is that arrays are more efficient on modern hardware, but they're not more universal. Universality, after all, is trivial!

I mean, machine code's underlying memory model is an array of numbers, though caches do add an (invisible) wrinkle.

@alcinnz @codesections

Right, it seems to me that the basic Von Neumann machine model is 'array/vector of unsigned integers of some word size' and then we go from there.
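
Sketched in Python, the entire model is more or less:

```python
# The basic Von Neumann model in one line; everything else
# (code, data, stacks, objects) is convention layered on top.
RAM = [0] * 65536   # an array of unsigned word-sized integers
```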

I guess what I want to know is:

If we're building our own virtual machine that doesn't start from, say, the POSIX environment - what's the smallest model we can get away with?

PicoLisp seems pretty small, but it builds in some nasty stuff (e.g. evaluating an integer executes machine code of the attacker's choosing).

@alcinnz @codesections

The 'serverless' people in the Cloud space are moving in on this territory. Amazon Lambda etc.

I'd like to see the non-proprietary, distributed-web community have an answer to what 'serverless function in the DWEB' might look like.

Either a very small Lisp or a very small Forth *might* be the right answer. Or, it might not. Forth in particular makes some very bad assumptions about security. And Lisp maybe doesn't play well with the C/POSIX memory layout. Dunno.

@alcinnz @codesections

On 'universality being trivial':

Yes, and also no.

Anything TM-ish can emulate any other TM, yes. But that's not always the most relevant criterion for human usage of a computing system.

I suppose what I keep wanting is some kind of data-plus-computing substrate that can 'capture, share and remix most of human thought'. Something like a hyperlinked spreadsheet (data plus formulas/functions) but a bit more freeform.

SQL databases seem a bit too constrained.

@alcinnz @codesections

The idea of homoiconicity I think is quite important. Without it, yes you can 'emulate' any data/rule set in any other data/rule markup language or format... but you can't always preserve it in the same 'shape'.

Preserving 'shape' (structure + syntax) seems important. Each time you transform data, you introduce the possibility of errors. So I think we'd like a format that introduces as little of its own structure as possible, ie, has a very minimal/general structure.

@natecull @codesections Yes, basically what I want to encourage with the phrase "universality is trivial" is to focus on the human factor.

Theoretically, all we really need is to read/write arbitrary amounts of data and write conditions upon them. In practice we need to figure out what that data is meant to mean.

Lisp's S-Expressions and Haskell's Abstract Data Types do strike me as trivial yet useful memory models though.

@alcinnz @codesections

Mmm. I think I'm trying for a 'concrete data type' rather than an abstract one. Or a 'concrete parse tree' perhaps.

Objects or ADTs do have the nice property that if you've got one in memory, you know it was generated 'correctly' (at least for this run of the program) because only one object (the constructor) is allowed to create them. But... this property gets VERY murky once we look beyond one single runtime and bring disk storage and network transmission into play.

@alcinnz @codesections

A fairly universal definition of 'object' seems like: 'a bunch of sequential storage cells beginning with an object type/class/prototype/identifier, which is probably itself a storage address pointing to an object that describes it'.

Beyond that, it gets very hard to agree on semantics. So if we could get at least a readable/writeable version of 'type-labeled chunk of storage', that seems a bit of a win.

The remaining puzzle is working out how to break cycles and how to uniquely name identifiers.
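
Roughly this shape, sketched in Python (all names are hypothetical; the point is just 'cell 0 points at the object that describes this one'):

```python
memory = {}      # address -> list of cells
next_addr = 0

def alloc(type_addr, *payload):
    global next_addr
    addr = next_addr
    memory[addr] = [type_addr, *payload]
    next_addr += 1
    return addr

# Bootstrap: a 'type' object at address 0 whose cell 0 points at itself.
TYPE = alloc(0, "type")
POINT = alloc(TYPE, "point")   # a type object describing points
p = alloc(POINT, 3, 4)         # an instance: [POINT, 3, 4]

def type_of(addr):
    return memory[memory[addr][0]][1]   # follow cell 0, read the describing object

print(type_of(p))      # 'point'
print(type_of(TYPE))   # 'type', since the cycle bottoms out by self-reference
```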

@alcinnz @codesections

Also things like: If an object requires a 'context' (another object) to be interpreted, find a way to automatically locate that context object and bring it into local storage.

Where 'context' might include things such as: class/prototype of an OO object; symbol table of a bunch of Lisp cells; libraries of a linked executable; environment of a closure.

Lua's idea of 'metatables' seems relevant and useful. It comes so very close to being a universal language.
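
Loosely translated into Python (the method names are mine, not Lua's), metatable-style delegation is just 'missing key? ask the context':

```python
class Table(dict):
    def __init__(self, *args, meta=None, **kw):
        super().__init__(*args, **kw)
        self.meta = meta    # the 'context' object needed to interpret this one

    def lookup(self, key):
        node = self
        while node is not None:
            if key in node:
                return node[key]
            node = node.meta    # missing key: fall through to the context
        raise KeyError(key)

proto = Table(greet=lambda self: f"hi, {self.lookup('name')}")
obj = Table(name="nate", meta=proto)

print(obj.lookup("greet")(obj))   # 'hi, nate', resolved via the context chain
```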

@alcinnz @codesections

I *think* Racket's idea of 'syntax' objects is somewhat like a cons marked up with a context?

I'd like to see something like that, but universal.

It probably needs to be slightly less granular than at the cons level, though; otherwise we triple our storage cost. Which is why I keep thinking about type-tagged arrays/vectors (ie very lightweight objects) as more universal than conses.
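
A loose sketch of that 'datum plus context' pairing, in Python (field names are mine, not Racket's):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Stx:
    datum: Any      # the underlying data, possibly containing nested Stx
    context: dict   # e.g. scopes, a source location, a symbol table

# Wrapping every node like this is exactly the per-cons overhead worry:
expr = Stx(("+", Stx("x", {"scope": "let-body"}), 1),
           {"srcloc": ("repl", 1, 0)})
```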

@alcinnz @codesections

When I say 'triple the storage cost': an absolutely trivial and dumb universal storage layer would be a triple store. Very much like RDF. Every cell in the entire RAM space being a triple of

(type, car, cdr)

We probably don't want that; it wastes a huge amount of storage. But I think it would be the most general and universal data-management system possible, one that removes some ambiguities cons storage has and is more extensible.

We could do Type with one bit, though.
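
As a toy, in Python (a hypothetical allocator; addresses are just indices):

```python
cells = []   # the entire 'RAM': a growable list of (type, car, cdr) triples

def alloc(typ, car, cdr):
    cells.append((typ, car, cdr))
    return len(cells) - 1   # the new cell's address is its index

# The list (1 2) as explicitly type-tagged pairs, RDF-triple style:
nil = alloc("nil", None, None)
b   = alloc("pair", 2, nil)
a   = alloc("pair", 1, b)

# Every cell carries its own type, so no stolen tag bits are needed,
# at the cost of one extra word per cell versus a bare (car, cdr) cons.
print(cells[a])   # ('pair', 1, 1)
```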


@alcinnz @codesections

ie: if most (car, cdr) cells are understood to be pointers into a storage space of at least 16-bit words (more likely 32- or 64-bit), then the alignment leaves low bits free: we can steal them to represent some well-known Types, reserving one to say 'the car is the Type, the cdr is the payload'.
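
Sketched in Python, assuming 64-bit cells so the three low bits are free (these particular tag assignments are invented):

```python
TAG_PTR    = 0b000   # plain pointer to a cell
TAG_FIXNUM = 0b001   # small integer stored inline
TAG_SYMBOL = 0b010   # index into a symbol table
TAG_TYPED  = 0b111   # the escape hatch: 'car is the Type, cdr is the payload'

def tag(value, t):
    return (value << 3) | t

def tag_of(word):
    return word & 0b111

def payload(word):
    return word >> 3

w = tag(42, TAG_FIXNUM)
assert tag_of(w) == TAG_FIXNUM and payload(w) == 42
```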

@alcinnz @codesections

I think most Lisps do this, or similar (PicoLisp steals the bits from the pointer to the cons, not from the car/cdr of the cons itself). But they tend to only allow built-in type tags, and they don't have a generally agreed-on extensible mechanism for reading/writing user-defined type tags to a serialised ASCII representation.

I think we need that, otherwise if you have in-memory types that can't be 100% correctly serialised, you don't really have a data storage layer.
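
Something like this registry is what I mean by an agreed-on mechanism. Sketched in Python, with a '#tag(...)' notation I've just made up, not any existing Lisp's reader convention:

```python
registry = {}   # tag name -> (to_fields, from_fields)

def deftype(tag, to_fields, from_fields):
    registry[tag] = (to_fields, from_fields)

def dump(tag, value):
    to_fields, _ = registry[tag]
    return f"#{tag}{to_fields(value)!r}"

def load(text):
    tag, fields = text[1:].split("(", 1)
    _, from_fields = registry[tag]
    return from_fields(eval("(" + fields))   # toy parser; a real reader would not eval

import datetime
deftype("date",
        lambda d: (d.year, d.month, d.day),
        lambda f: datetime.date(*f))

s = dump("date", datetime.date(2021, 3, 14))
print(s)         # #date(2021, 3, 14)
print(load(s))   # 2021-03-14, so the user-defined tag survived the round trip
```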

@alcinnz @codesections

And that's maybe the crux of my general feeling of unhappiness with the current leading edge of computer science and programming languages theory (Haskell et al and the type people):

I think they're focussing so much on 'correctness of computing / computability' that they have lost sight of 'correctness of *data import/export fidelity*'. And I think data is more important than computation.

Data outlives computation and passes through multiple computing machines.

@alcinnz @codesections

The type-theory and big-compilation-up-front people are pushing so hard on 'restrict the allowable / expressible forms of data so that computation on that data is guaranteed to be correct'

But if you restrict the expressible forms of data, you introduce data loss. Data is what it is. It is primary. *Data is why we have computers*. Computation needs to be seen as a secondary thing that MUST adapt itself to the shape of the data - not the other way around.

@alcinnz @codesections

And this feeds into a political issue:

1. Users create data. Their needs must be primary.

2. Programmers create only computation. Their needs are secondary to those of the user.

3. The programmer - the app developer, the database administrator, the server host, the business manager, the IT oligarch - does NOT and should NOT outrank the user.

4. But today's models and mindset of 'correctness' and 'best practice' in both programming theory and business puts 2 over 1.

@alcinnz @codesections

It's basically the same problem that George Orwell saw when he satirised similar trends in linguistics as 'Newspeak'. The idea that you could 'eliminate error from communication' by strong definition and enforcement of allowable speech up front.

There will always be important data that end-users create that just doesn't fit the currently fashionable data schemas maintained by experts and big organizations.

We need to remember this and not pre-code such data as 'error'.

@alcinnz @codesections

This also folds into my unhappiness with the Object-Oriented Programming model:

1. OOP says that all data should be considered 'objects', ie, computing machines

2. Computing machines are 'dangerous' in various ways (halting problem, memory security, etc) and so they must be restricted

3. But if you insist on modelling data as objects, then you insist on modelling a simple and safe and general thing as a complex and dangerous and restricted thing

@alcinnz @codesections

4. Therefore, it would be better that we model objects and other computing machines (when we need to) as data, rather than data as objects. To do otherwise is to invert the abstractions, modelling a simple thing as a complicated thing, and that just invites errors of all kinds.

5. However, modelling objects as data violates one of the core rules of OOP: that objects be opaque 'black boxes' whose state cannot be accessed by the user.

6. There's a deep mismatch here.

@natecull @alcinnz @codesections
OO doesn't specify that all data is built as objects, only that abstract, extractable data should be treated as such. Streams aren't technically objects, per se.