Happy Midsummer's Eve / St. John's Eve, all Swedes, Swedophiles and Scandinavians!
@djsundog @balor Apparently there is also http://bjson.org/ . They list half of the formats I mentioned and also others, so I guess they must have figured they had really good reasons for creating Yet Another Binary JSON.
> Binc is a lightweight, compact, limitless, schema-free, precise, binary, high-performance, feature-rich, language-independent, multi-domain, extensible, data interchange format for structured data.

https://github.com/ugorji/binc
> Fast, compact, schema-less, binary serialization and deserialization oriented towards dynamic languages

> This format was started because the authors had technical reasons for producing a better Storable.

> Before we embarked on this project we had a look at various prior art. This included a review of Google Protocol Buffers and of the MessagePack protocol. Neither suited our needs so we designed this, liberally borrowing ideas from the other projects.

https://github.com/Sereal/Sereal
Gobs are interesting, partly because they sort of use an idea I had the other week (I hear Avro uses that idea too): Use a schema-dependent format, but one that starts by shipping the schema written in a bootstrap schema.

https://blog.golang.org/gobs-of-data
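
To make the "ship the schema first" idea concrete, here is a toy sketch. It is a made-up format for illustration only, not the real gob or Avro wire encoding: the names `dump` and `load`, the JSON header and the use of `struct` format characters as field types are all my assumptions.

```python
import json
import struct

# Hypothetical toy format, NOT the actual gob or Avro encoding.
# Stream layout: [4-byte big-endian schema length][schema as JSON][records...]
# The schema is a list of [field_name, struct_format_char] pairs.


def dump(schema, records):
    """Serialize records, prefixed by the schema that describes them."""
    header = json.dumps(schema).encode("utf-8")
    fmt = ">" + "".join(code for _, code in schema)
    out = struct.pack(">I", len(header)) + header
    for rec in records:
        out += struct.pack(fmt, *(rec[name] for name, _ in schema))
    return out


def load(data):
    """Read the schema from the stream, then use it to decode the records."""
    (hlen,) = struct.unpack_from(">I", data, 0)
    schema = json.loads(data[4:4 + hlen].decode("utf-8"))
    fmt = ">" + "".join(code for _, code in schema)
    size = struct.calcsize(fmt)
    names = [name for name, _ in schema]
    records = []
    for off in range(4 + hlen, len(data), size):
        records.append(dict(zip(names, struct.unpack_from(fmt, data, off))))
    return schema, records
```

The receiver needs no prior knowledge beyond the bootstrap rule "length, then JSON schema", and the records themselves stay as compact as in a schema-dependent format.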
https://en.wikipedia.org/wiki/SDXF https://tools.ietf.org/html/rfc3072 calls itself self-describing, and I guess it is, but not even as much as RIFF or IFF is -- there is a container datatype and you have IDs for chunks, but the IDs are not arbitrary-length strings or even FourCCs, they're just a 16-bit blob.

I guess it could be made human-readably self-describing, and serialize JSON, if you define a chunk for naming chunk IDs.
Found an old (first public description, https://tools.ietf.org/html/rfc1014 , is older than the web) #cerealization format called XDR. Not self-describing. Latest RFC is https://tools.ietf.org/html/rfc4506 . Apparently it's used in e.g. NFS, ZFS, R and SpiderMonkey.

https://en.wikipedia.org/wiki/External_Data_Representation

Interestingly, it has a representation for quadruple-precision floats.
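
A minimal sketch of what "not self-describing" means here, assuming I read RFC 4506 right: everything is big-endian and padded to 4-byte units, and nothing on the wire says which type comes next. The helper names are mine, not from any XDR library.

```python
import struct


def xdr_int(n):
    # XDR signed integer: 4 bytes, big-endian, two's complement.
    return struct.pack(">i", n)


def xdr_opaque_var(data):
    # Variable-length opaque data: 4-byte unsigned length, the bytes,
    # then zero padding up to the next multiple of 4.
    pad = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def xdr_string(s):
    # XDR strings are ASCII bytes, encoded like variable-length opaque data.
    return xdr_opaque_var(s.encode("ascii"))
```

The bytes for `xdr_int(5)` and the length prefix of a 5-byte string are identical, so a reader has to know the schema to tell them apart.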
#sbe is not self-describing:

> To use SBE it is first necessary to define a schema for your messages.
I can't tell at a glance whether https://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-marshaling is self-describing or not, and I don't get how dicts work. It would make sense for it to follow some schema, and they talk about some things being introspectable and some not.

Does anybody know the D-Bus wire format?
@balor @djsundog In terms of total data transfer? I would assume XML if one doesn't insist on binary. After that JSON? After that maybe BSON, because all the hipster systems run on MongoDB? Otherwise ASN.1, because it's in a whole bunch of protocols, including TLS, but it depends on what one includes in "binary format for structured data". ASN.1 is not self-describing: both sides need to agree on the schema beforehand.

And then there's bencode, powering the protocol that drives most of the data use in many parts of the internet.
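
For reference, bencode (the encoding BitTorrent uses) is small enough to sketch in a few lines. This is my own minimal encoder, not taken from any particular client; treating Python `str` as UTF-8 byte strings is an assumption on my part.

```python
def bencode(value):
    """Encode ints, byte strings, lists and dicts in bencode."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; bencode has no boolean type.
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear sorted as raw bytes.
        items = [
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        ]
        items.sort(key=lambda kv: kv[0])
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)
```

Unlike most formats in this thread it is self-describing and has one canonical encoding per value, which is why hashing a bencoded dict works as an identifier.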

The format people should probably be using if they don't have any particular requirements is CBOR, because it's an RFC. The awesomest one is probably Cap'n Proto, because it's the Second System by a person who helped make Protobuf and learned from its mistakes.

Of course some people use netstrings, because they're a DJB specification.
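
Netstrings are about as minimal as framing gets: decimal length, a colon, the bytes, a comma. A quick sketch (function names are mine; this decoder does not reject leading zeros in the length, which a strict one should):

```python
def netstring_encode(payload: bytes) -> bytes:
    # <decimal length> ":" <payload bytes> ","
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def netstring_decode(data: bytes):
    """Decode one netstring from the front of data; return (payload, rest)."""
    head, sep, tail = data.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or malformed length prefix")
    n = int(head)
    if len(tail) < n + 1 or tail[n:n + 1] != b",":
        raise ValueError("payload truncated or missing trailing comma")
    return tail[:n], tail[n + 1:]
```

For example, `netstring_encode(b"hello world!")` gives `b"12:hello world!,"`.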