Simplified Molecular Input Line Entry System (SMILES) describes the structure of molecules in short ASCII strings. It was invented in the 1980s and became an open standard in the 2000s.

something I didn't know about until reading

SMILES is interesting because it's an example of a terse, human-optimised ASCII representation of 2D or 3D structures, made in the 1980s. Around the same time, the 'visual programming' movement was claiming that such a thing either wasn't possible or wasn't important, and that we should interact directly with visual models of software. That movement generally didn't put any thought into designing a serialisation format.

It's a line of thought that crops up among visual programming advocates even today.

The rest of the programming community, of course, ignored the visual programming people (with the exception of IDE designers) and just kept writing textual, line-oriented ASCII formats, because they work, and because the dimensionality of the structures we work with in programming is a lot higher than 3D.

But we could still improve them a lot, and also improve the mapping between textual and graphical descriptions of multidimensional structures. SMILES shows us one way.
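
To make the text-to-graph mapping concrete, here is a rough Python sketch of how a SMILES string unfolds into atoms and bonds. It is only an illustration under loud assumptions: the `parse_smiles` function and its names are made up for this post, it handles a small subset (plain organic atoms, `-`, `=`, `#` bonds, parenthesised branches, single-digit ring closures), and it ignores aromatic lowercase atoms, charges, brackets, stereochemistry and implicit hydrogens. It is not the OpenSMILES grammar.

```python
import re

# Tokens for a small SMILES subset (an assumption of this sketch): two-letter
# halogens first, then single-letter atoms, bond symbols, branch parentheses,
# and single-digit ring-closure labels.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#]|[()]|[0-9]")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def parse_smiles(smiles):
    """Return (atoms, bonds) for a simple SMILES string.

    atoms is a list of element symbols; bonds is a list of
    (atom_index, atom_index, order) tuples.
    """
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES syntax in {smiles!r}")

    atoms, bonds = [], []
    prev = None         # index of the atom the next atom attaches to
    pending = None      # explicit bond symbol waiting for its second atom
    branch_stack = []   # atoms to return to when a branch closes
    open_rings = {}     # ring label -> (atom index, bond symbol or None)

    for tok in tokens:
        if tok == "(":
            branch_stack.append(prev)       # remember the branch point
        elif tok == ")":
            prev = branch_stack.pop()       # go back to the branch point
        elif tok in BOND_ORDER:
            pending = tok
        elif tok.isdigit():
            if tok in open_rings:           # second use of a label closes the ring
                start, sym = open_rings.pop(tok)
                bonds.append((start, prev, BOND_ORDER[pending or sym or "-"]))
            else:                           # first use of a label opens the ring
                open_rings[tok] = (prev, pending)
            pending = None
        else:                               # an atom symbol
            atoms.append(tok)
            if prev is not None:
                bonds.append((prev, len(atoms) - 1, BOND_ORDER[pending or "-"]))
            prev = len(atoms) - 1
            pending = None
    return atoms, bonds


# Acetic acid: a chain with one branch holding the double-bonded oxygen.
print(parse_smiles("CC(=O)O"))
# Cyclohexane: the two "1" labels mark the atoms that close the ring.
print(parse_smiles("C1CCCCC1"))
```

The whole trick is in three bits of punctuation: juxtaposition means "bonded to the previous atom", parentheses mean "come back here afterwards", and a repeated digit means "these two atoms are also joined". That is how a one-dimensional string carries a graph.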

@natecull Some of those SMILES give a vibe of regular expressions for chemists. The sort of thing where after a while working actively with both the described material and the descriptor, your brain twists around enough to be able to parse and produce it linearly well enough in the moment.

@natecull (Also, the opaqueness I'm feeling reading those now and comparing them to the reference imagery is similar to my early regex memories and how I've heard others describe some of my worse regexes.)

@bb010g @theoutrider 'regexes for chemists' sounds like it describes both the upside and downside of SMILES
