"Why don't people use formal methods?" - I really enjoy Hillel Wayne making some of this stuff more accessible to "blue collar" programmers like me

The history on the SMT stuff seems a bit off but otherwise a good writeup on the matter I think

hillelwayne.com/post/why-dont-

@csirac2 having mixed feelings from a first read (confessing to hitting the link on @Kensan's :birdsite: link…): the fact that the SMT stuff is so off puts me off (excuse the pun)…

Follow

@cynicalsecurity @csirac2 I have to admit that I only skimmed it during lunch. To be read in full after work.

If you want a more in depth history of SAT/SMT there is this paper:
satassociation.org/articles/FA

h/t Dr. Heidy Khlaaf (@HeidyKhlaaf on birdsite)

@cynicalsecurity @csirac2 Also, here is the birdsite link:
twitter.com/kensan42/status/10

Sorry for doing this but I thought I would add it for reference.

@cynicalsecurity @csirac2 So after reading the whole article I think all in all it is a good summary of the subject. I was surprised that SPARK was featured several times. Most of the time people don’t bother with it because it is not deemed suitable for „proper“ full end-to-end proofs by itself. Never mind that it has been applied in the industry since the early 80s... but I digress :)

@Kensan @cynicalsecurity in my (admittedly limited) career: SPARK, Isabelle, Z3 & JML (if that counts) are only ones I've had colleagues w/some commercial exp. Yet to really play w/any of it in any dayjob yet :/ I'm happy if we can just enumerate & sanity-check dynamically-constructed (!) FSMs, or visualize RBAC policies, or even just know what 3rd-party stuff is in large systems, *or even document any security goals at all*. Anything that's not coding is often perceived as antithetical to agile

@Kensan @cynicalsecurity I have seen so many pointlessly wasted person-years poured into fundamentally flawed systems, especially distributed systems, one sprint at a time - when pretty much a few weeks of modeling the design in anything at all, formal methods or otherwise, would have easily shown the infinitesimal islands of stability

@Kensan @cynicalsecurity there is a certain background level of arrogance that I feel must be fought at all times, even in myself, which at its peak too often manifests as things like this: mobile.twitter.com/andy_pavlo/

.. and this is what I mainly want to use things like TLA+ for: to demonstrate hidden complexity and depth of raw engineering work hidden in many frivolous decisions, before we jump off those cliffs

@csirac2 @cynicalsecurity It is quite clever to argue that formal verification is so much more expensive. That statement is true if you only consider the time up *until* delivery/deployment.

Sure, you are not as quick hacking the code but then again, if applied properly/in a reasonable manner, you do not spend massive amounts of effort on debugging and fixing bugs in production systems. You kind of pay the price up front during development and don’t pass the bucket on to the users.

@csirac2 @Kensan let this toot be a placeholder for a far larger discussion.

Distributed systems are difficult.

Calling “the cloud” a distributed system is clearly just marketing because, in reality, you are simply offering a computing farm with a sexed-up batch queue: you purchase CPU time, you submit your job (which is now a full stack of crap) and it runs on the first spare vCPU available.

Tons of useless code developed to support this daft model.

A long time ago I used to dabble with…

@csirac2 @Kensan Linda¹, a parallel programming paradigm based on a tuple space into which you placed tuples of data for processing which were picked up by jobs. The tuple space made the underlying architecture invisible and, rather beautifully, meant that the data was central to the whole paradigm².

Imagine an AWS where instead of purchasing CPU and then running stacks of…
__
¹ en.wikipedia.org/wiki/Linda_(c
² the naming of Linda is a horrible story which sadly detracts from the idea :flan_sad:

@csirac2 @Kensan fashionable crap to replicate a proper server on someone else’s virtual machine, you actually submitted data tuples for processing by jobs advertising the capability?

Imagine a trivial example: need an addition? Send the tuple (add, 2, 3) into the tuple space and associate a payment form. A suitable job picks it up and returns answer & invoice.

This goes beyond unikernels into what looks almost like a global interpreter.

Do we need millions of copies of Apache? Nope :)

@csirac2 @Kensan in a different life I would have probably worked hard to develop this idea (which, I hasten to add, dates from the early 90s… I was doing a lot of parallel computing back then, including very early Beowolf clusters, hence my knowledge of the Don Becker EtherExpress saga…).

What is even more interesting is, given the tuple space above, how does one secure the data? Here the whole concept of attestation, as fashionably bombed onto us by Intel’s SGX, is the answer: you would need…

@csirac2 @Kensan to be able to, for example, attest that your addition service correctly processes data according to the GDPR and, similarly, the requestor could attest that he is capable of paying by providing an attestation of credit by a bank.

This is a long story and my wee dram of whiskey is finished… it should be expanded, perhaps by others. :flan_bow:

:flan_sleep:

@cynicalsecurity @Kensan Thanks for the Linda notes! Do you see any room for this kind of thing in Eg. golem.network ? I wonder if they've reviewed what's been tried before?

@cynicalsecurity @Kensan FWIW the "distributed systems" I've helped repair aren't necessarily cloud in the modern sense; my adventures mostly involve data spread across multiple sites, at times involving MongoDB, Cassandra, grotesque multi-master things bolted onto mysql/postgres, a bespoke distributed log thing.. the unifying feature is they all wanted high-availability & performance, but got the opposite: perf, consistency, integrity & availability end up inversely proportional to no. sites

@csirac2 @Kensan you see: that is not a “distributed system”. It is just a collection of services thrown around a network.

A proper distributed system is transparent, like Plan 9, where CPU servers ran jobs, visualisation servers gave you screens and file servers kept your data (and backed it up). You need not know where you data or your code is, you merely have a window into it via the visualisation server.

@cynicalsecurity @Kensan I guess I want to describe the producer/consumer things that these setups were serving and how their view of the world was supposed to be ignorant of how multi-site worked, but you're right it really was just a clusterfsck and describing it as a distributed system actually obscures the crimes the creators of those systems did to otherwise perfectly innocent 1s & 0s, even if the cascading stability & reliability issues were fascinating to me at the time

@cynicalsecurity @Kensan a colleague on the most memorable of these once described the whole thing as "a very expensive experiment we didn't need to do"

@csirac2 @Kensan probably a suitably polite definition of “clusterfuck” if ever there was one.

What is remarkable is how we are migrating systems to clusters of clusterfucks based on hardware which has no reliability characteristics (PCs) and software held together by spit & sellotape…

@cynicalsecurity @Kensan If I'm honest I can't remember a paid gig where that wasn't the case; what seems to be accelerating the entropy is fewer folks (me included) understanding layers immediately adjacent to or within their scope of work. Just recently saw in a moderately famous codebase w/code-review process, in an area configuring networking, it shells out to `cat /proc/.. | cut -f2 -d' '` instead of calling uname(), in compiled userland code that has no business doing that in 1st place

@cynicalsecurity @Kensan anyway thanks for allowing me to indulge in this cathartic thread :D I will try to post about things that matter one day

@csirac2 @Kensan But these *are* things tha matter! Understanding the weaknesses and pointlessness of current designs is essential to avoid the self-fulfilling, self-celebrating cabal of “the Cloud” getting away with it.

@cynicalsecurity @csirac2 We package the fucked cluster into a package (aka. VM or container) that everyone can easily download. From there we go level up and repeate. It’s babushka dolls of clusterfs ;)

@csirac2 @Kensan statistically anything older than five years did not happen in modern IT… :flan_sad:

@cynicalsecurity @csirac2 I don’t know what was going on five years ago :D Oh wait! We had just published Muen a few months ago :p

@cynicalsecurity @csirac2 It really is tragic, that even virtual machines have a BIOS with SMM and everything (there is a compile-time option to disable it in SeaBIOS btw) just because both sides, hypervisor and guest OS/firmware, insist on implementing the “physical” interface. I know, I am rambling on about a tiny technical detail and I don’t really know “cloud” and how the big providers do it so please forgive my bad mood and chalk it up to my ignorance...

@Kensan @cynicalsecurity This has always profoundly offended me! This is why I've been such a xen head over the years; partly because it was always there, but also partly because we can boil away that crud if we want. I might be exposing my ignorance here, I'm burning out in a very reactionary/response-driven job atm and can't plan research time like I used to, but I still wonder what's so revolutionary about Firecracker other than being a modern take on PV/cruftless-boot that xen pioneered

@Kensan @cynicalsecurity (on x86 at least, I think: I assume you could both point me to earlier PV shenanigans than Xen :D) I've not been reading many papers about virtualization for the last year, but I'm still waiting for treatments of KVM equivalent to really awesome papers like this: trustkernel.com/uploads/pubs/n "Deonstructing Xen", where they analyze flaw history, attack surface, architecture and logically put it back together again... Linux/KVM is a far more difficult monolith

@csirac2 @cynicalsecurity Xen was really interesting since it really accelerated the broad adoption of virtualization. It was in 2003 with no ubiquitous hardware virtualization support in sight on Intel’s x86 platforms. Nowadays it is clear that Xen is lugging around all that baggage of a +15 year old code base. I think Xen worked well enough and nobody cared about disaggregation of the systems that were built using it.

@csirac2 @cynicalsecurity I wanted to take a look at Firecracker but just can’t find the time. Afaik it’s a clean slate implementation of a VMM in Rust with significant backing.

@Kensan @csirac2 I personally think Xen is long past its sell-by date. But today is Saturday and I have better things to do than re-design virtualisation :D

@cynicalsecurity @csirac2 He goes into quite a bit of history and gives some perspective. Recommended.

@Kensan @cynicalsecurity awesome stuff, thanks for the link!

Xen certainly is showing its age, it's just... amazingly hackable. Like the perl of hypervisors :-) I guess that's why still so many experiments are still being done with it. I'm just skeptical that Linux/kvm is much of a step forward in attack-surface/complexity.

@csirac2 @cynicalsecurity KVM is a type-2 hypervisor and thus fundamentally different in that the hypervisor is directly lumped up with an entire OS kernel.

@cynicalsecurity @csirac2 What can I say other than ¯\_(ツ)_/¯

Even the idea that you can build performant virtualization systems with one VMM per VM is still a foreign concept...

@csirac2 @cynicalsecurity Well as I wrote in the toot: go thank Heidy Khlaaf, I am just parroting what I picked up through her tweet.

@csirac2 @cynicalsecurity The great thing about SPARK and Ada is that you don’t have to do full formal verification for the entire codebase. It can be applied very fine-grained e.g. on subprogram granularity. Also you can just start out with data flow and move on to absence of runtime errors etc later. Just determine which part of your system would benefit the most and start there.

Sign in to participate in the conversation
Mastodon

Server run by the main developers of the project 🐘 It is not focused on any particular niche interest - everyone is welcome as long as you follow our code of conduct!