*bart simpson at the blackboard voice*: we don't need better hardware, we need more efficient software, we don't need better hardware, we need more efficient software, we...


@djsundog Probably related: Even C isn’t as “close to the metal” as it used to be because no one understands the actual hardware. It’s all too complex, too proprietary, and everyone has too little time before the release cycle ends

@cypnk @djsundog no but seriously risc computers much more closely match the machine architecture that C was implemented for, but can also be put on massively multi-core chips for parallel applications. imagine a raspberry pi, but it's 128 risc-v cpus on a single board

@cypnk @chr @djsundog i heard if you open up a pentium 2 there are just a bunch of pentium ones inside

@cypnk @djsundog it's so weird how our current hardware is basically risc, but with a complex microcode-assisted operation transformation frontend so it can pretend to be cisc, plus crazy instruction scheduling stuff to work around the fact that you can't compile targeting the cpu's real pipeline (there are so many different implementations that targeting one would be counter-productive anyways).

@kepstin @djsundog Not to mention a whole heap of additional structure and software hidden beneath. I read somewhere Intel’s Management Engine is basically an embedded MINIX 3

On a CPU. That’s... nuts

@cypnk @djsundog the management engine isn't really in the path of normal computation, tho. it's more or less taking the separate management chip that servers have (typically an arm or mips core running linux!) and putting it on the same die as the cpu.

(it does do weird stuff in the boot path to handle loading signed firmware, etc, but once the cpu's booted it's mostly independent)

remember, a "computer" is just a network of various big and small processors talking to each other.

@cypnk @djsundog I've always thought the transmeta crusoe was a really cool piece of tech - instead of making an x86 instruction decoder, do software emulation of an x86 processor, and throw in jit transpiling to the cpu's native instruction set so it performs reasonably well.

@kepstin @cypnk I was convinced Crusoe was going to lead to revolutionary new processor designs. Welp,

@djsundog @cypnk my favourite bit of the crusoe was the fact that the native instruction set had no mmu or memory protection - that's all implemented in the emulation layer.

@djsundog @cypnk i'm kinda sad that nvidia never added x86 support to the implementation they made after buying transmeta's IP en.wikipedia.org/wiki/Project_

(also they kinda cheated a bit by having a hardware arm decoder in addition to the software dynamic recompilation layer)

@djsundog @cypnk the thing about it that improves efficiency, in theory, is that the conversion to cpu native opcodes can be cached for long periods of time in a large ram buffer (modern x86 cpus have smaller on-die micro-op caches), which means that it's a win to spend more time up-front re-optimizing for the cpu, instead of running scheduler & prediction tricks every time an instruction is decoded.
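
A toy sketch of that idea, in Python: translate a block of guest code once, spend some effort optimizing it, then keep the result in a cache keyed by guest address so later runs skip the translation step. Everything here is hypothetical and for illustration only (the three-field guest instruction format, the `TranslationCache` class, the add-folding "optimization"); it is not how Crusoe's code morphing software was actually built.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy guest ISA: (opcode, destination register, immediate).
GuestInsn = Tuple[str, str, int]
Regs = Dict[str, int]


class TranslationCache:
    """Caches translated blocks by guest address (illustrative only)."""

    def __init__(self) -> None:
        self.cache: Dict[int, Callable[[Regs], None]] = {}
        self.translations = 0  # how many times we paid the up-front cost

    def translate(self, block: List[GuestInsn]) -> Callable[[Regs], None]:
        self.translations += 1
        # Up-front "optimization": fold consecutive adds to the same register.
        folded: List[GuestInsn] = []
        for op, dst, imm in block:
            if op == "add" and folded and folded[-1][0] == "add" and folded[-1][1] == dst:
                folded[-1] = ("add", dst, folded[-1][2] + imm)
            else:
                folded.append((op, dst, imm))

        def run(regs: Regs) -> None:
            # Stand-in for the "native" code produced by the translator.
            for op, dst, imm in folded:
                if op == "mov":
                    regs[dst] = imm
                elif op == "add":
                    regs[dst] += imm
                else:
                    raise ValueError(f"unknown opcode: {op}")

        return run

    def execute(self, addr: int, block: List[GuestInsn], regs: Regs) -> None:
        native = self.cache.get(addr)
        if native is None:
            # Cache miss: translate and optimize once, then remember it.
            native = self.translate(block)
            self.cache[addr] = native
        native(regs)
```

The point of the sketch is only the shape of the trade-off: `translate` can afford to be slow because `execute` hits the cache on every run after the first.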

@djsundog @kepstin @cypnk Same. I honestly never expected the response to something better to be "we don't want something better."

@kepstin @cypnk @djsundog One of my professors once characterized x86 thus:

Imagine a field with a couple of trees. Cut them all down and build a cabin. Then demolish the cabin and use the remains to build a bigger one. Then demolish that one and use it to construct a small town. Then a small neighborhood. Then a large town. Then suburbs...

That's x86 right there.

The only way out would be to scrap the whole mess and start over. But then all the work would go into emulating the old shit.

@eldaking Um... yeah, kinda, now that I read about it. Somehow, I think Factorio has less technical debt in that regard.

@kepstin @cypnk @djsundog
> can't compile targeting the cpu's real pipeline
is that what Mill is trying to do?

@grainloom @cypnk @djsundog you'll need to give me some context, I don't know what "mill" is supposed to be here, and google searches are inconclusive.

@grainloom @cypnk @djsundog huh, that's interesting: "compilers are required to emit a specification which is then recompiled into an executable binary by a recompiler supplied by the Mill Computing company"

so code has to go through a machine-specific optimizer/translator before it can run at all.

@grainloom @cypnk @djsundog
An example of a cpu arch where compilers had to optimize for a specific pipeline was intel's Itanium - a fairly simple in-order vliw style core. it turned out that the compiler optimizers didn't get to the point where Itanium was competitive with x86 until well after everyone had given up on it.

@kepstin @cypnk @djsundog they are aware of the Itanium's issues. AFAIK code generation for the Mill is pretty straightforward: the belt is basically SSA in hardware.
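
A rough way to picture the "SSA in hardware" comparison, as a minimal Python sketch: a fixed-length queue where every operation drops its result at the front, operands are named by their position from the front, and old values simply fall off the end instead of being overwritten. The `Belt` class, its method names, and the default length of 8 are assumptions made for illustration, not the Mill's actual instruction set or belt size.

```python
from collections import deque


class Belt:
    """Toy belt: results are pushed at the front, never written in place."""

    def __init__(self, length: int = 8) -> None:
        # With maxlen set, appendleft on a full deque drops the oldest
        # value from the far end automatically.
        self.slots: deque = deque(maxlen=length)

    def push(self, value: int) -> None:
        self.slots.appendleft(value)

    def get(self, pos: int) -> int:
        # Position 0 is the most recent result, 1 the one before it, etc.
        return self.slots[pos]

    def add(self, a: int, b: int) -> None:
        self.push(self.get(a) + self.get(b))

    def mul(self, a: int, b: int) -> None:
        self.push(self.get(a) * self.get(b))


# Compute (2 + 3) * 4 using only belt positions as operand names.
belt = Belt()
belt.push(2)
belt.push(3)
belt.add(0, 1)   # pushes 5
belt.push(4)
belt.mul(0, 1)   # pushes 4 * 5
result = belt.get(0)
```

Each value is produced exactly once and is only ever read afterwards, which is the single-assignment property the comparison to SSA is pointing at.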

@cypnk @djsundog Close to the metal is kind of a joke with modern architectures trying desperately to make the thing look like an 8086 to the assembler and compiler, when what's going on under the hood is a completely different beast.

@cypnk @djsundog How long do you think it'll be before the easiest to get compilers are those sold by the manufacturers, because they can generate the code-generating code straight from the VHDL?

@drwho @djsundog I think this is already happening somewhat, but for GPUs. I think this is what Nvidia CUDA is

@cypnk @djsundog ...I don't know. Have to look into it. And, perhaps, start weeping.

@drwho @cypnk @djsundog interestingly, processor manufacturers have been either forking or contributing directly to llvm so they only have to do the final code generation rather than the whole compiler.

AMD's ROCm platform for developing GPU applications is an example of that.

@cypnk @djsundog assembly isn't even "close to metal" due to all the nonsense intel is doing

@ben @djsundog Yeah, pretty much. All manner of gymnastics under the hood that we'll never get to see. I'm not entirely convinced there's a single person at Intel itself that knows all of it at this point

@cypnk @ben might be able to prove mathematically that it's impossible for such a person to exist.

@ben @cypnk and at the opposite end, you can't even start the cpu on a raspberry pi without asking the gpu to start it up for you. it's ludicrous in its overcomplexity through and through, no matter where you look.
