More adventures with machine code
SubX now has a test harness, and support for string literals.
Current (increasingly silly) factorial program to compare with the parent toot: http://akkartik.name/images/20180923-subx-factorial.html
Two new features:
a) Autogenerated `run_tests` (line 26) which calls all functions starting with 'test_'.
b) String literals (line 31). They get transparently moved to the data segment and replaced with their address.
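A rough sketch of what those two translation steps might look like, written in Python rather than SubX. Everything here is an assumption for illustration: the names `generate_run_tests`, `hoist_string_literals` and `string_literal_N`, and the pseudo-assembly syntax, are hypothetical and are not how SubX itself is implemented.

```python
import re

def generate_run_tests(function_names):
    """Emit a pseudo-assembly run_tests routine calling every test_* function."""
    lines = ["run_tests:"]
    for name in function_names:
        if name.startswith("test_"):
            lines.append(f"  call {name}")
    lines.append("  ret")
    return lines

# Simplifying assumption: literals contain no escaped quotes.
STRING_LITERAL = re.compile(r'"([^"]*)"')

def hoist_string_literals(code_lines):
    """Move string literals into a data segment, leaving a label behind.

    The label stands in for the address that a later label-resolution
    step would fill in.
    """
    data_lines = []
    new_code = []

    def replace(match):
        label = f"string_literal_{len(data_lines)}"
        data_lines.append(f'{label}: "{match.group(1)}"')
        return label

    for line in code_lines:
        new_code.append(STRING_LITERAL.sub(replace, line))
    return new_code, data_lines

code, data = hoist_string_literals(['push "hello"', 'call print'])
# code == ['push string_literal_0', 'call print']
# data == ['string_literal_0: "hello"']
```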
That isn't much progress for a month. I've been trying a few different things and learning a lot:
a) I spent some time looking at the GREAT stage0/Mes bootstrap project.
b) I tried to design a type-safe low-level language that could be converted line by line to machine code, but it turned out to be a bust (thanks @firstname.lastname@example.org!). It's possible if you give up on freeing heap memory.
I'm now convinced there are no shortcuts. Gotta build a real compiler in machine code.
There's only one problem: I don't know how to build a compiler. Not really. And definitely not in machine code. So I'm going to be fumbling around for a bit. Lots more wrong turns in my future.
I've been trying to port https://compilers.iecc.com/crenshaw to SubX. With tests. It's been slow going, because I have to think about how to make Crenshaw's primitives like `Error()` and `Abort()` testable.
I don't know if just learning to build a compiler will sustain my motivation, though. So some other ideas:
a) Some sort of idealized register allocator in Python or something. I've never built one, and my intuitions on how hard it is seem off.
b) Port Mu's fake screen/keyboard to SubX so that I can reimplement https://github.com/akkartik/mu/tree/master/edit#readme. It was just too sluggish since Mu was interpreted. Even my 12-year-old students quickly dropped it in favor of Vim.
'Modules' in machine code
SubX now supports code in multiple files. Segments with the same name get merged.
One subtlety: later code is *prepended* to earlier code. Gives us executable libraries with overrideable entrypoints.
$ git clone https://github.com/akkartik/mu
$ cd mu/subx
$ ./subx translate *.subx -o out
$ ./out # run tests
$ ./subx translate *.subx apps/factorial.subx -o out
$ ./out; echo $?
120 # factorial(5)
$ ./out test # run tests
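The prepending rule can be sketched in a few lines of Python. This is only a model under stated assumptions: `merge_segments` is a hypothetical name, files are represented as lists of `(segment_name, lines)` pairs, and execution is assumed to start at the first line of the merged code segment.

```python
def merge_segments(files):
    """Merge same-named segments across files given in command-line order.

    Each file is a list of (segment_name, lines) pairs. Contents from later
    files are placed in front of contents from earlier files.
    """
    merged = {}
    for segments in files:
        for name, lines in segments:
            # Later contents go in front of whatever was already collected.
            merged[name] = list(lines) + merged.get(name, [])
    return merged

library = [("code", ["call run_tests", "exit"]), ("data", ["lib_msg"])]
app = [("code", ["call factorial", "exit"])]

merged = merge_segments([library, app])
# merged["code"] == ["call factorial", "exit", "call run_tests", "exit"]
```

With the library alone, the first line of the code segment runs the tests. Adding the app puts its entrypoint first, so under the assumption above it takes over without the library having to change.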
We also now have the ability to allocate new segments in virtual memory using mmap(). The first new segment I plan on is a 'trace' segment that tests can write to and make assertions on. Automated white-box testing: http://akkartik.name/post/tracing-tests
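To make the trace idea concrete, here is a small Python sketch that uses an anonymous memory mapping as the trace buffer. The `Trace` class, its `append`/`contains` methods and the `label: message` line format are assumptions for illustration, not the SubX design.

```python
import mmap

class Trace:
    """Append-only log of lines stored in an anonymous memory mapping."""

    def __init__(self, size=4096):
        self.buffer = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
        self.size = size
        self.used = 0

    def append(self, label, message):
        line = f"{label}: {message}\n".encode("utf-8")
        if self.used + len(line) > self.size:
            raise MemoryError("trace segment is full")
        self.buffer[self.used:self.used + len(line)] = line
        self.used += len(line)

    def contains(self, label, message):
        """Return True if an exact 'label: message' line was traced."""
        expected = f"{label}: {message}"
        lines = self.buffer[:self.used].decode("utf-8").splitlines()
        return expected in lines

def factorial(n, trace):
    """Code under test: records what it does in the trace as it runs."""
    trace.append("factorial", f"called with {n}")
    result = 1
    for i in range(2, n + 1):
        result *= i
    trace.append("factorial", f"result {result}")
    return result

trace = Trace()
factorial(5, trace)
assert trace.contains("factorial", "result 120")
```

The test asserts on what the code recorded while running, not just on its return value, which is the white-box part.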
Next up, once I have some tracing primitives: a dependency-injected interface for sockets so that we can write automated tests for a fake network.
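A minimal sketch of the dependency-injection idea in Python, assuming the code under test only needs `sendall` and `recv` (both of which real `socket.socket` objects provide). `FakeConnection` and `fetch_greeting` are hypothetical names made up for this example.

```python
class FakeConnection:
    """Test double that records sent bytes and replays a canned reply."""

    def __init__(self, reply):
        self.reply = reply
        self.sent = b""

    def sendall(self, data):
        self.sent += data

    def recv(self, max_bytes):
        chunk, self.reply = self.reply[:max_bytes], self.reply[max_bytes:]
        return chunk

def fetch_greeting(connection, name):
    """Send a name, then read until a newline or end of stream.

    The connection is passed in, so a real socket or a fake works equally.
    """
    connection.sendall(name.encode("utf-8") + b"\n")
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = connection.recv(16)
        if not chunk:
            break
        reply += chunk
    return reply.decode("utf-8").strip()

fake = FakeConnection(b"hello, subx\n")
assert fetch_greeting(fake, "subx") == "hello, subx"
assert fake.sent == b"subx\n"
```

Because nothing in `fetch_greeting` opens a socket itself, the test runs without touching a network.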
@akkartik @email@example.com @haitch @freakazoid Honestly, my advice here is to read two books: "Programming a Problem Oriented Language" teaches you how to write your own Forth compiler from scratch (bare metal on up). There are no assembly listings in the book, because it's pure guidance, but it was instrumental in me getting DX-Forth working at all.
Second, if you're willing to tackle the more sophisticated compiler techniques needed to have a proper infix compiler, I would then recommend reading Compiler Construction, by Niklaus Wirth (see http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf).
You don't need fancy register allocation to get a working compiler that meets 80% of your performance goals. Don't let that slow you down.
@akkartik @firstname.lastname@example.org @haitch @freakazoid
Next, I would strongly recommend reading https://www.cs.indiana.edu/~dyb/pubs/nano-jfp.pdf . This was instrumental in getting my BSPL compiler prototype to work, and I'll be revisiting this again when I revise BSPL once more to port VertigOS to the #Kestrel3.
@akkartik @email@example.com @haitch @freakazoid There seems to be this great fear of passes in compiler design. It's a one-pass compiler, or a two-pass compiler. Hogwash; do 56 passes if you must; all that matters is the final output. Register allocation can be one (or maybe a few) of those passes towards the end.
@vertigo That being said, if performance is a primary goal, I'd suggest considering whether simpler heuristics are applicable.
@haitch @akkartik @firstname.lastname@example.org @freakazoid Nice thing about not shying away from passes in compiler implementation is that passes are often plug-and-play relative to each other. That allows you to follow Butler Lampson's philosophy: "Make it work, make it right, make it fast."
A lot of what goes into making a compiler produce the fastest possible code is an act of diminishing returns, and frequently has to be redone every processor generation. Ergo, the more you can modularize it, the better.
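As a toy illustration of plug-and-play passes, here is a Python sketch in which each pass is just a function from a program to a program. The tuple-based instruction format and the names `fold_constants`, `remove_self_moves` and `run_passes` are assumptions made up for this example, and it only handles straight-line code.

```python
def fold_constants(program):
    """Replace an add of two known constants with a single const."""
    known = {}
    out = []
    for instr in program:
        op, dest = instr[0], instr[1]
        if op == "const":
            known[dest] = instr[2]
        elif op == "add" and instr[2] in known and instr[3] in known:
            value = known[instr[2]] + known[instr[3]]
            instr = ("const", dest, value)
            known[dest] = value
        else:
            # dest is overwritten with something we do not track
            known.pop(dest, None)
        out.append(instr)
    return out

def remove_self_moves(program):
    """Drop moves from a variable to itself."""
    return [i for i in program if not (i[0] == "mov" and i[1] == i[2])]

def run_passes(program, passes):
    """Feed the output of each pass into the next."""
    for compiler_pass in passes:
        program = compiler_pass(program)
    return program

program = [
    ("const", "a", 2),
    ("const", "b", 3),
    ("add", "c", "a", "b"),
    ("mov", "c", "c"),
    ("add", "d", "c", "x"),
]
optimized = run_passes(program, [fold_constants, remove_self_moves])
# optimized == [("const", "a", 2), ("const", "b", 3),
#               ("const", "c", 5), ("add", "d", "c", "x")]
```

Since every pass has the same shape, adding, removing or reordering one is a change to the list passed to `run_passes`, not to the other passes.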
@vertigo Anyway, routines that juggle more than 32 variables at any given time are possibly not very well thought out. And routines that use more than that should probably be penalised and split into other functions. This is less of a problem for processors with many registers, but I see how it can be a concern for @akkartik, who's specifically targeting IA32.
@AceNovo @akkartik @firstname.lastname@example.org I mention it in my follow-up toot, https://mastodon.social/@vertigo/100781598805099246 .