A quote from a 2015 Slashdot interview with the creator of C++'s Standard Template Library, Alexander Stepanov, popped up in my birdsite feed:
<< I am still convinced that Simula/C++/Java style inheritance is unsound.>>
<<One of the biggest changes since then has been the growth of caches. Cache misses are very costly, so locality of reference is much more important now. Node-based data structures, which have low locality of reference, make much less sense.>>
I'm wondering particularly how, or if, Lisp cons cells (which are the maximally non-local kind of 'node structure') can be made to work better with caching, while still preserving the simplicity and generality that makes them attractive.
Especially across the Internet, where things like proxies and file systems count among those many layers of cache.
@natecull I seem to recall that Symbolics supported vectors of elements (CDR-coding, I think they called it?).
It's not as good as, say, a language in the APL family, though; I think K presents the best possible compromise between array-based and LISPy languages. (Although contemporary versions of APL seem to support a K-like ability to hold non-uniform array elements.)
@vertigo Yeah. CDR-coding using, say, the lowest one or two bits (as you'd get in eg 32-bit words) appeals to me.
Lowest bit can mean '0 = jump, or 1 = call'.
Then: 0000 means 'jump to address 0, ie, NIL' which means 0 is a perfectly good 'sentinel' value to end a list with.
0001 means 'call NIL', ie, 'include NIL as an element in the list'
A nice thing about this is that we can have cells which are *just* jumps to other addresses, for patching or just optimising memory layouts.
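Sketching that tagging scheme in Python (a toy model, with hypothetical helper names; the real thing would be raw machine words):

```python
# Sketch of the low-bit tagging scheme above. The lowest bit of a cell
# word distinguishes 'jump' (0) from 'call' (1); the remaining bits are
# an address. The all-zero word then decodes as 'jump to address 0',
# i.e. NIL, which makes 0 a natural end-of-list sentinel.

JUMP, CALL = 0, 1
NIL = 0  # address 0 plays the role of NIL

def make_word(tag, address):
    """Pack a 1-bit tag and an address into one integer word."""
    return (address << 1) | tag

def unpack_word(word):
    """Return (tag, address) from a packed word."""
    return word & 1, word >> 1

assert make_word(JUMP, NIL) == 0   # 0000: 'jump to NIL', the sentinel
assert make_word(CALL, NIL) == 1   # 0001: 'include NIL as an element'
```

A pure-jump cell is then just any word with the low bit clear and a nonzero address, which is what makes the patching/relocation trick fall out for free.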
@vertigo Also: yeah, the APL/K/etc vector type processing probably is really unhappy with list structure because it's just a very different set of assumptions about computation.
List structure assumes that you have ragged, unstructured data that you NEED to process carefully, one piece at a time.
Array structure assumes that you have 'dumb but big' data - tightly structured vector-field data that you want to process rapidly in parallel. Physics simulations, corporate databases, etc.
@natecull Slight correction -- APL1 and J are like this, but APL2 (and its progeny) and K are not. Like Go, these languages equate a vector and a "list" (e.g., they can hold aggregates of non-uniform data).
If you look at the primitives offered by K, you'll find that it's really just a very compact Lisp.
That said, you can still do ragged computing in APL/J as well, though you must box your data first, losing some of its performance benefits.
@vertigo Hmm. Do Go/APL2/K list/vectors allow for Lisp 'dotted pairs'?
which are a bit nasty and force unnecessary and unwanted sequential-ness on lists, but also let you do some weird and useful things that can't quite be expressed any other way.
I'm kind of torn as to whether dotted pairs are good or evil or just chaotic neutral.
I feel like we kind of need both lists with them, and lists without them. I think only the second kind can be parallelised.
@vertigo ... actually I suppose a dotted-pair list COULD be quite easily parallelised... just store it as a pair of (vector,scalar)
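As a quick Python sketch of that (vector, scalar) idea (names hypothetical; `None` standing in for NIL):

```python
# An improper (dotted) list stored as a (vector, tail) pair:
# (1 2 3 . X) becomes ([1, 2, 3], 'X'). A proper list is the same
# structure with tail = None. The vector part can be random-accessed
# or processed in parallel; only the single trailing scalar carries
# the improper-ness.

def from_parts(elements, tail=None):
    return (list(elements), tail)

improper = from_parts([1, 2, 3], 'X')   # (1 2 3 . X)
proper = from_parts([1, 2, 3])          # (1 2 3)

assert improper == ([1, 2, 3], 'X')
assert proper == ([1, 2, 3], None)
```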
hmmmmmm
there's probably no deep connection there to quaternions and other hypercomplex numbers, that always have to have a weird extra scalar to add to the sensible vector part
probably
@natecull Depending on your specific needs, a dotted pair is just a vector of length 2. Was there a specific use-case you had in mind where this wouldn't work?
@vertigo A dotted pair is indeed a vector of length 2, but a list of length n constructed out of dotted pairs cannot be represented as a vector of length n.
because it takes n+1 elements to represent it - the +1 is the final CDR.
Most 'vector/array representations of lists' just assume that that final CDR is always NIL - ie, that it's a *proper list* not an *improper* (dotted) list - and don't bother to store it.
@natecull To amortize the overhead of repeated realloc() calls, most vector languages actually over-allocate their vectors, so unless memory gets tight, you have as many CDRs as you need. ;)
I disagree with your claim that a vector of length n cannot replace a Lisp list. The +1 can come from anywhere: be it a terminating NULL pointer, or a field elsewhere which provides an explicit array length.
@natecull In APL and J, arrays always have a length field. In Go, a length and max capacity field. They definitely can be used to replace Lisp-style lists. J even has head and tail operators.
@vertigo Certainly that +1 can come from anywhere. But it's something different to the n elements of the vector. It needs to be tracked and managed separately.
It's an n+1th element. That's the point. It's a non-vector construct. A vector, as a vector, can only represent a proper list. An improper list is bigger than a proper list.
A logical place to put this (n+1)th element might well be slot 0 of a vector indexed from 0 to n. But you can't just put it IN the n elements.
@vertigo This is important to me because I've been thinking about a whole class of data structures based on exploiting the improper-ness of improper lists.
One use case is for representing structures like relations, or unions of logical assertions, where you might have:
(A B C 1)
(A B C 2)
and you might like to save space by compressing this into a structure like
(A B C | all (1) (2))
That '| all' part isn't *quite* a dot but it's semantically similar; it's not vector-y.
@vertigo so, you might want to split this maybe into two chunks:
a pure proper list or vector [A B C]
the 'tail' of that list (| all (1) (2))
then eg if you wanted to access 'element 3' you just look up 3 in the vector and get C
but if you wanted to access 'element 4', well, that's trickier; there is no element 4; there's 'CDR of element 3', so sort of 'element 3.5', but then semantically that implies TWO 'element 4s': 1 and 2.
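Here's roughly what I mean, sketched in Python (all names and the tail encoding are hypothetical):

```python
# The split representation: a pure proper-list prefix plus a non-vector
# tail holding the branching continuations, i.e. (A B C | all (1) (2)).
# Lookups inside the prefix are plain indexing (using the thread's
# 1-based 'element 3' convention); past the prefix, there is no single
# answer, only one value per continuation.

prefix = ['A', 'B', 'C']            # the pure proper-list part
tail = ('all', [['1'], ['2']])      # the '| all (1) (2)' part

def element(prefix, tail, i):
    """Look up 1-based element i; past the prefix the answer branches."""
    idx = i - 1
    if idx < len(prefix):
        return prefix[idx]
    # 'element 4' isn't one value: each continuation supplies its own.
    kind, branches = tail
    return [branch[idx - len(prefix)] for branch in branches]

assert element(prefix, tail, 3) == 'C'          # plain vector lookup
assert element(prefix, tail, 4) == ['1', '2']   # TWO 'element 4s'
```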
@natecull Well, to be fair, that's not a list, proper or improper. That's a tree. :)
@vertigo And yet, I've represented it as a Lisp list, so it literally *is* a list.
@natecull I suppose, but now we're into "syntax doesn't matter; it's all semantics" territory.
@vertigo Semantics is just another kind of syntax, in an enclosing system...
But this is the whole thing about this very fine and subtle but VERY important distinction, and why, if you take any given Lisp program and try to map its lists-made-out-of-pairs onto a language that only offers you arrays or vectors, it will likely explode in a fiery heap - but only *sometimes*: in the cases where it really needs that extra trailing cell.
It's super annoying actually. Our data structures *almost* map.
@vertigo Vectors are probably a whole lot nicer really, because you can random-access them and access elements or compute in parallel and you can get the best use out of your storage and cache.
On the other hand, pure pairs are easier to allocate? At least that's the PicoLisp guy's argument for why he doesn't support vector types.
I figure a pointer to a vector cell needs to be maybe two address words wide:
* pointer to the start of the vector structure (including length)
* index
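A toy Python model of that two-word pointer (hypothetical names; `None` for NIL):

```python
# A reference into a vector is a (vector, index) pair, so CAR is a
# plain lookup and CDR is just index+1 - until you run off the end,
# at which point you get NIL (modelled here as None).

def car(ref):
    vec, i = ref
    return vec[i]

def cdr(ref):
    vec, i = ref
    return (vec, i + 1) if i + 1 < len(vec) else None

r = (['A', 'B', 'C'], 0)
assert car(r) == 'A'
assert car(cdr(r)) == 'B'
assert cdr(cdr(cdr(r))) is None   # ran off the end: NIL
```

In a real implementation the 'vector' half of the pair would point at a header carrying the length, which is what makes the bounds check in `cdr` possible.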
@natecull The only reason they're easier to allocate is because they're fixed length. However, allocating arbitrary vectors isn't terribly hard either. I've written plenty of decent quality memory allocators myself. With a buddy-system allocator, you end up over-provisioning vectors almost for free, which further amortizes allocation cost into noise.
For reference: Go's "slices" (vector reference) are three words long: pointer to the memory, current length, and maximum capacity.
@vertigo I wonder what if we had storage allocated in super fixed size pages (1K, 2K, something like that) and then a pointer would just have X bits for the page and Y bits for the index
and you could always look up in the page and get all the metadata-y stuff....?
It seems like surfacing some notion of 'page' might be useful for thinking about caching, filesystems, network packets etc...
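The bit-split itself is simple; sketching it in Python with illustrative sizes (2 KiB pages, so 11 offset bits; the metadata table is hypothetical):

```python
# A pointer is (page number << PAGE_BITS) | offset. Per-page metadata
# (type tags, GC info, etc.) lives in a table keyed by page number, so
# the pointer itself doesn't need tag bits lopped off.

PAGE_BITS = 11                # 2**11 = 2048-byte pages
PAGE_SIZE = 1 << PAGE_BITS

def pack(page, offset):
    assert 0 <= offset < PAGE_SIZE
    return (page << PAGE_BITS) | offset

def unpack(ptr):
    return ptr >> PAGE_BITS, ptr & (PAGE_SIZE - 1)

page_metadata = {3: {'type': 'cons'}}   # hypothetical per-page tag table

p = pack(3, 40)
page, offset = unpack(p)
assert (page, offset) == (3, 40)
assert page_metadata[page]['type'] == 'cons'
```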
@natecull We had that on at least two occasions in computing history. Early computers with wide data paths would often store a base pointer in the upper-most bits of a word, and an index or offset in the lower bits. IIRC, this is where CAR and CDR come from (the two halves of an address register).
It happened again when segmentation came into existence. However, most people no longer enjoy using segmentation.
@natecull @vertigo that sounds like a BiBOP (Big Bag of Pages), a memory organisation that has popped up many times in the history of Lisp - indeed, at least one Scheme implementation noted for speed (Chez Scheme) still uses a refinement of it. it means, among other things, that you don't need to lop off bits for type tagging - it's always implied in the address.
and of course, it works quite nicely with virtual memory if the page sizes match - means you can relocate pages transparently.
@thamesynne @natecull I don't want to bash the author; he's doing something I've never been able to do (create a thriving and arguably healthy community around his creation). His decisions seem to work for him, and that makes me happy. My creations are also stuck in the past in many respects.
When I said PicoLisp has the advantage of always using fixed-sized cells, I was just intending to highlight that he has different design constraints than, say, a vector-native runtime environment.
@natecull I have to admit, I don't see the incompatibility, at least in the lower-level representation of the data structures. But if storing a trailing 0 serves as the nil sentinel, then it's not a huge loss. CONS-cell representations will consume more memory as a rule once lists exceed three elements.
@vertigo @natecull I've been thinking about this optimization in the context of "one reference only" implementations like newLISP. newLISP doesn't have this optimization and has to do more copying than most lisps. I think you might be able to make ORO a lot more practical if you turned lists into vectors whenever possible. You could even avoid realloc by only doing it when you're already copying the list.
@natecull @vertigo Erlang "arrays" are trees with fixed-sized cells (somewhere around 8-12 elements IIRC). This makes append very fast while random access is log n. I can't remember if inserting in the middle is fast but I seem to recall the middle nodes weren't required to all be full. This is also a very cache-efficient structure.
@seanl @natecull By that same logic, you can use the same trick as what skiplists use to amortize their otherwise large pointer overhead, of embedding a small (but finite) number of elements in each node in the list, so access times are amortized. Say, each node is 64 to 128 bytes in length, minus pointer overhead. That sounds pretty close to what Erlang is doing at the leaves of its tree.
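A minimal Python sketch of that chunked-node ('unrolled list') idea, with an illustrative chunk size (everything here is hypothetical, not how Erlang or any skiplist library actually lays out memory):

```python
# Each node is (elements, next): up to CHUNK elements plus one link,
# so the pointer overhead is amortised over the chunk and each node
# fits in a cache line or two.

CHUNK = 8

def build(xs):
    """Build a chain of (elements, next) nodes from a flat sequence."""
    head = None
    # Walk the chunks from the end so each node links to its successor.
    for start in reversed(range(0, len(xs), CHUNK)):
        head = (xs[start:start + CHUNK], head)
    return head

def to_list(node):
    """Flatten the chain back into a plain Python list."""
    out = []
    while node is not None:
        elems, node = node
        out.extend(elems)
    return out

chain = build(list(range(20)))
assert to_list(chain) == list(range(20))
assert len(chain[0]) == CHUNK   # first node holds a full chunk
```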
@natecull Yep. AmigaOS used a similar structure for "tag lists", allowing C code to provide optional parameters to system functions (e.g., like OpenWindowTags to create and open a new window on the screen).
Yeah, AmigaOS got Greenspun'ed when Commodore released Kickstart 2.0. ;)