
One of the fascinating things about object-oriented programming is how we freely mix up the ideas of 'interface' and 'behaviour' and think that by having specified an interface we have specified a behaviour.

Which is a bit like mixing up the ideas of 'I cut the red wire' and 'the bomb doesn't explode'.

Some important detail is missing in between.

(quote from an otherwise wonderful essay that I probably agree with, but, *an interface is so very much not a behavior*)

tedinski.com/2018/01/23/data-o

I grumble about this because I am a system administrator, not a coder, which means that my entire day job revolves around taking an 'interface' provided by a manufacturer which is 'supposed' to do some behaviour... and *then* trying to reverse engineer what *actual* behaviour results from that interface and apply the appropriate workarounds so we get the actual behaviour WE need.

Interface is never behaviour.

Manufacturers always lie.

Never trust an interface.

From my perspective, coders live in a kind of abstracted, coddled little bubble where they are given a bunch of interfaces which - by pure chance - sometimes HAPPEN to have behaviour that loosely resembles what's written in spec.

This is almost never the case at the wider level of application installation and scripting.

Oh, the things I've seen in corporate installers, you would not believe.

Good lord.

<< An anecdote: ever see generated code in Java try to store a blob of data? Say, an array of integers? If you just have a static array, well, that turns into bytecode that allocates an array and one-by-one assigns each element to the correct value at class load time, and can quickly exceed the bytecode size limits for a method. >>

<< There isn’t (yet?) a format for just any kind of data in .class files, so you end up encoding the data as strings and then concatenating and decoding those strings at runtime. >>
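A minimal Java sketch of the string-packing workaround described in the quote. The class name `PackedInts`, the sample values, and the one-char-per-value encoding are all assumptions made for illustration; real code generators may pack their data quite differently.

```java
import java.util.Arrays;

final class PackedInts {
    // Hypothetical table: each char of the literal carries one 16-bit value.
    // A string literal is stored in the class file's constant pool, so it does
    // not add per-element initialisation bytecode to the class initialiser.
    // A single constant is limited to 65535 bytes of encoded text, which is
    // why large tables get split into several strings and concatenated.
    private static final String PACKED = "\u0003\u03e8\u07d0\u7fff";

    // Decoded once, when the class is loaded.
    static final int[] VALUES = decode(PACKED);

    // Turns each char back into the int it encodes.
    static int[] decode(String packed) {
        int[] out = new int[packed.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = packed.charAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(VALUES)); // prints [3, 1000, 2000, 32767]
    }
}
```

The decode loop is a single short method no matter how large the table gets, which is the point of the trick.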

As a comparison: BASIC, in 1977, had a 'DATA' statement for inline data (was pretty dumb, just a big array of integers and strings, but still).

@natecull

One might argue that it's the case in *correct* code.

If you consider the interface to be a (non-machine-enforceable) contract, then either interface is the same as behaviour or your code has a bug.

@suetanvil Right, the big problem here is that our programming languages almost never bother to actually define behaviour, while investing vast amounts of effort in defining interfaces.

Which is quite amusing. And very, very terrifying.

Behaviour certainly CAN be formally defined, with things like contracts and assertions. We just don't usually do it, not at the level of obsession that we do for 'do the wires have the same shape plugs'.
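As a rough illustration of what 'contracts and assertions' can look like when the language has no built-in support, here is a hand-rolled sketch in Java. The `Account` class and its rules are hypothetical; the checks are plain runtime tests, not anything the compiler verifies.

```java
final class Account {
    private long balance; // invariant: balance >= 0

    Account(long opening) {
        if (opening < 0) {
            throw new IllegalArgumentException("opening balance must be >= 0");
        }
        balance = opening;
    }

    long balance() {
        return balance;
    }

    void withdraw(long amount) {
        // Precondition: what the caller must guarantee.
        if (amount <= 0 || amount > balance) {
            throw new IllegalArgumentException("amount must be in 1..balance");
        }
        long before = balance;
        balance -= amount;
        // Postcondition: what this method promises in return.
        if (balance != before - amount || balance < 0) {
            throw new IllegalStateException("postcondition violated");
        }
    }
}
```

The method signature `void withdraw(long)` says none of this; the behaviour only becomes explicit once the checks are written down.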

@suetanvil By 'behaviour' I mean things like:

* side effects - what on earth is that OOP object doing while it's processing your method call? What method calls is it making, and to where? Do you know? Does your language runtime even care? Do you have any way of finding out?

* state change - what has changed in the object itself after your method? How will new method calls change? Does encapsulation mean you're even allowed to know?

* substructure - what's IN that Type X object it gave you?
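A small Java sketch of the first two bullets, using made-up names (`Greeter`, `PureGreeter`, `LeakyGreeter`). Both classes satisfy the same interface, and the compiler treats them as interchangeable, yet one has side effects and hidden state that the signature never mentions.

```java
import java.util.ArrayList;
import java.util.List;

interface Greeter {
    String greet(String name);
}

// No side effects, no state: same input, same output, every time.
final class PureGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Same interface, very different behaviour.
final class LeakyGreeter implements Greeter {
    static final List<String> SEEN = new ArrayList<>(); // shared, hidden from callers
    private int calls;

    public String greet(String name) {
        SEEN.add(name); // side effect: records every name it is given
        calls++;        // state change: later calls answer differently
        return calls > 2 ? "You again, " + name : "Hello, " + name;
    }
}
```

Nothing in `Greeter` lets a caller tell these two apart ahead of time.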

@natecull

Programming is a primarily human behaviour. Code is written by humans FOR humans (the compiler only cares about a subset of it).

The purpose of an interface is primarily to tell the human(s) looking at the code where to find the document that describes a correct implementation. And that document may well say, "whatever feels right."

Consider,

X::zzz(Y)

vs

Document::printOn(Display)

It makes no difference to the compiler but the second promises specific behaviour.

@natecull

(And then, there's the question of

1. Can you specify correct behavior that isn't just writing the function?

2. If you can, is it going to be more robust than just writing the function?)

@suetanvil The answers to both 1 and 2 are 'Yes!'

If we were talking about, say, screws rather than code, your argument here would be the same as saying:

'But can you specify a screw size that isn't just building a screw by hand?'

'And even if you can, is it going to be more robust than just building a screw and fitting by hand?'

And the answer is Yes! Yes it is! We can! We do! It's easier to measure a screw than to build one! That's how we got the Industrial Revolution! By inventing specs!

@suetanvil

The important idea here is that the specification of a behaviour does not have to be the *complete* behaviour... just a correct subset of it.

In much the same way that a map does not have to be an exact full-scale 1:1 replica of the terrain, but does have to be a correct representation of it.

But we have to start by even *trying* to describe behaviour in a declarative way, which we currently mostly aren't.

(The testing movement is getting there, but it can't test negatives).
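To make the 'correct subset' idea concrete, here is a hedged Java sketch of a deliberately partial specification for sorting. `SortSpec` and its method names are invented for this example. The check says what a valid result looks like without saying how to produce one, and it is intentionally incomplete.

```java
import java.util.Arrays;

final class SortSpec {
    // Partial spec: the output has the same length as the input and is in
    // ascending order. It does NOT check that the output is a permutation of
    // the input, so it describes only a subset of "sorted correctly".
    static boolean satisfiesSpec(int[] input, int[] output) {
        if (input.length != output.length) {
            return false;
        }
        for (int i = 1; i < output.length; i++) {
            if (output[i - 1] > output[i]) {
                return false;
            }
        }
        return true;
    }

    // One possible implementation to hold against the spec.
    static int[] sorted(int[] input) {
        int[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```

An implementation that returned all zeros would also pass, which shows the gap between a partial spec and the full behaviour; adding a permutation check would narrow it.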

@suetanvil Real-world examples of 'behaviour':

* using a firewall to impose a negative behavioural requirement that packets not be sent to certain places

* using Wireshark to see what packets are *actually* being sent when a program executes

* taking before and after snapshots of the filesystem and registry (on Windows) to see what local state changes are *actually* made when a program executes

It would be nice to have stuff like this at the message-and-object level of a language runtime.
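Java's standard `java.lang.reflect.Proxy` gives a small taste of this at the object level. The sketch below (the `CallRecorder` name is made up) wraps any interface implementation and logs each method name that crosses the wrapper. It is an assumption-laden toy: it only sees calls made through the proxy, not what the target does internally.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

final class CallRecorder {
    // Returns a stand-in for target that appends each invoked method's
    // name to log before forwarding the call.
    @SuppressWarnings("unchecked")
    static <T> T record(Class<T> iface, T target, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target actually threw
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

For example, `CallRecorder.record(Runnable.class, task, log)` yields a `Runnable` whose every `run()` shows up in `log`, a kind of Wireshark for one object boundary.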

@suetanvil I suppose testing harnesses are sort of evolving in this direction, but it seems to have been a very slow, parallel, evolution in the two worlds of devs vs ops.

And if we had some way of guaranteeing ahead of time (through code analysis or something) 'what the object can't do', that might be nice.

Eg: this random object foo we got off a github: can it write to the filesystem? Can it send TCP/IP packets? Can we please prove that it can't do either, before running it?

@suetanvil (Capability systems probably improve this situation enormously... our current OO scripting and application languages tend to provide one huge, massively-over-authorised object like 'System' to all programs, and maybe it would be easier to prove security properties if we didn't do that.)
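A loose Java sketch of the capability style: instead of reaching for a global like `System`, the code is handed only the narrow object it needs. `LineSink` and `Report` are hypothetical names. Note the large caveat that Java does not enforce this; any class can still call `System` or `java.io` directly, so this shows the shape of the idea rather than a proof of anything.

```java
import java.util.List;

// The only authority the report code is given: "you may emit lines".
interface LineSink {
    void writeLine(String line);
}

final class Report {
    // By convention, render touches nothing except the sink it was handed.
    // In an object-capability language that would be a guarantee; here it
    // is only a discipline.
    static void render(List<String> items, LineSink out) {
        for (String item : items) {
            out.writeLine("- " + item);
        }
    }
}
```

The caller decides where lines go: `Report.render(items, System.out::println)` for the console, or `Report.render(items, someList::add)` to capture them in memory for inspection.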

I have two problems with your screw metaphor.

Firstly (and less importantly), screws are not complex. I can functionally and unambiguously describe a screw in one line of text via ISO standards. A better example would be "a large city's transit system".

Software is *complex*.

Secondly (and more importantly), we tend to think of source code as the final artifact because it is the main product of human labor, but that's not really the case. Source code is the blueprint for the executable. We think of it as the product because the "manufacturing process" is effectively free, but it's really the design stage.

So you shouldn't be talking about the screw. You should be talking about the *blueprint* that I send off to the screw factory.

When I say I can specify a screw in one line, I'm invoking a number of ISO standards and probably a bunch of other engineering descriptions or possibly just using a well-known part number. In other words, I have a screw-description library to call upon.

That library is itself the result of hundreds of years' worth of trial-and-error human labour where some of that error was *extremely* expensive in both money and lives.

(And even with all of that at my disposal, I can still get it wrong and buy two million incorrect screws.

Leaving me... screwed.

•_•)
( •_•)>⌐■-■
(⌐■_■)

)

Your examples (firewall rules, packet analysis, filesystem change) are also telling, in that they all require actually running the code to see what it does.

That is to say, building the bridge and then running a series of trucks over it to see if it collapses. If it doesn't after enough trucks, we assume it's safe (but then again, have we *really* tried a typical sample of trucks or will one come along years from now and break everything?)

(Yes, I'm switching metaphorical artifacts here. Sorry.)

Consider: what is the number of legal filesystem state changes a word processor can perform? The answer is, too many to count.

So we have to subdivide.

The operating system restricts it to certain files, directories and registry keys. The API asserts certain broad constraints and the unit tests try out likely bad cases.

And all of this helps, but it doesn't *guarantee* anything. We're still just sending more trucks over the bridge.

It's similar at the language level. We can't limit a language to only correct behaviour and still have a useful language (as Turing proved) so the best we can do is impose broad constraints at compile time ("this variable is an integer") or runtime ("you can't write past the end of an array").

But ultimately, the most effective way to determine if a function provides correct behaviour is verification-by-human. This is expensive and error-prone but there's nothing better.

(That being said, there's some interesting stuff going on in the whole automating-this-stuff space.

Contracts (like Eiffel) and Concepts were both slated for C++20, to let the developer add specific constraints that the compiler can (sometimes) enforce. Concepts shipped; Contracts were pulled from the draft before release.

And automated formal verification is a thing. This C compiler

compcert.inria.fr/

is formally verified correct.

Also, static analysis (which currently pays my bills) is a growing thing.)

But this stuff isn't a solved problem.

(Also: I found writing about this really useful as a way to learn to articulate my unease with this subject. So thanks.)

@suetanvil I think this fails too. Code is instructions for the behavior of a system. There’s a way in which specifying what the behavior should be and what it is will bleed together in software. Tests and contracts and stuff are important, but they are cut from the same cloth as the code in a way that doesn’t really have a good analogy in any other domain that I know of.

@vector

This is more or less what I've been saying.

The concept of "correct behaviour" is a human thing that software can't validate. Tools will help but that's all they do.

In light of that, an interface is a contract from one human to another outlining what their code is allowed and/or expected to do.

@natecull Or just store everything in files or databases, where it should be anyway. Java got a lot of things wrong, but the .properties file is excellent.

@mdhughes

<< Or just store everything in files or databases, where it should be anyway. >>

That's certainly the Java approach, but I personally feel it's the opposite of the right direction.

I would order the sensible places to put constant data (best to worst):

* inline, next to the code that uses it

* separate data files in an unrelated filesystem, maybe not even checked into the code version control system, requiring file access rights

* databases, requiring yikes! complexity

@mdhughes Databases, especially, are the worst offenders for making code not readily composable. The number of hoops you have to jump through to install a database, create a table, create a user, load it with data, set up some data export/backup system...

All for something that in a data-friendly language, could be one variable assignment statement at compile time.

@natecull This is a platform/tools issue. Making a Core Data database full of stuff is super trivial on the Mac/iOS, it's only slightly harder than making a spreadsheet. On Windows there's Access for that kind of thing, just as easy (other than all their tools being kind of awful).

It's not that hard with SQL and a script to initialize the database, but then you have a server to manage and all that shit.

But a text file is ridiculously easy, and in Java you can store them in a source package.

@mdhughes I guess I'm thinking about JavaScript (Node.js) here as a baseline, where you can just include a .json file from the source directory... but if you want to read a text file you have to include an entire general-purpose file-system management module, with all the security access rights that requires.

It seems like at least an order of magnitude complexity (and security risk) jump from one to the other.

@mdhughes with, for example, the fact that the Node.js filesystem module *simply does not exist* in the web browser environment, but the 'include .json file' mechanism does. And we might want a function to run, unchanged, written in the same language, in both places.

There seems to be no particular reason why a lot of the code we will be using in the future should assume an entire POSIX environment for each runtime.

@natecull I have a similar thing for Node/Electron environments, but then have to make an AJAX call and callback to read text or sound files if they ever go back to the web. But the web's got so many of its own problems with reliability, I don't treat it as an application platform anymore.

@mdhughes Right. That's a very sensible approach given the restrictions of our current systems.

But if we were stepping back a bit and looking at what we COULD have, not what we DO have: I think we could gain a lot by wrapping the idea of 'program' and 'resource' and 'configuration' all into one language. A language that could describe data as well as code.

I think it would be nice to have one object that's everything needed to run a program. And which we could modify with new configuration.

@natecull That's one of the attractions of Scheme for me. Data files are just sexpr. But I keep them in separate files to keep binary size down.

@natecull Yeah, JS is a very different thing. In Java, I have a utility function:

public Properties readPropertiesResource(String path) throws IOException {
    InputStream is = getClass().getResourceAsStream(path);
    if (is == null) {
        throw new IOException("Missing resource '" + path + "'");
    }
    Properties props = new Properties();
    try {
        props.load(is);
    } finally {
        is.close();
    }
    return props;
}

And I have similar things for other file types. Repetitive but you only do it once.
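On Java 7 or later, a helper like the one above could plausibly be written with try-with-resources, which closes the stream automatically. This is a sketch, not the poster's code: the `Resources` class and the static `readProperties` signature (taking the class whose classpath location anchors the lookup) are assumptions made so the example stands alone.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

final class Resources {
    // Loads a .properties file from the classpath, relative to anchor.
    // Throws IOException if the resource does not exist.
    static Properties readProperties(Class<?> anchor, String path) throws IOException {
        // A null resource is permitted here; close() is simply skipped for it.
        try (InputStream in = anchor.getResourceAsStream(path)) {
            if (in == null) {
                throw new IOException("Missing resource '" + path + "'");
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        }
    }
}
```

Usage would look something like `Properties p = Resources.readProperties(MyApp.class, "/config.properties");`, where both `MyApp` and the path are placeholders.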

@natecull You are not supposed to look. Otherwise the magic will escape.
