Factorial on SubX


Ok, I think I understand calling conventions now.

Also coming face to face with the pain of debugging machine code 😀


More adventures with machine code

SubX now has a test harness, and support for string literals.

Current (increasingly silly) factorial program to compare with the parent toot: akkartik.name/images/20180923-

Two new features:

a) Autogenerated `run_tests` (line 26) which calls all functions starting with 'test_'.

b) String literals (line 31). They get transparently moved to the data segment and replaced with their address.
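The two features above can be sketched in Python (not actual SubX; the real harness generates the call list at translation time in machine code). The function names below are made up for illustration:

```python
# Sketch of what the autogenerated run_tests does: collect every
# function whose name starts with 'test_' and call each one in turn.

def factorial(n):
    return 1 if n == 0 else n * factorial(n - 1)

def test_factorial_base_case():
    assert factorial(0) == 1

def test_factorial_recursive_case():
    assert factorial(5) == 120

def run_tests():
    # SubX emits a static list of calls at translation time; here we
    # discover the test functions dynamically just for illustration.
    for name, fn in sorted(globals().items()):
        if name.startswith('test_') and callable(fn):
            fn()
            print(name, 'passed')

run_tests()
```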


@haitch @vertigo @freakazoid @kragen@nerdculture.de

That isn't much progress for a month. I've been trying a few different things and learning a lot:

a) I spent some time looking at the GREAT stage0/Mes bootstrap project.

b) I tried to design a type-safe low-level language that could be converted line by line to machine code, but it turned out to be a bust (thanks @kragen@nerdculture.de!). It's possible if you give up on freeing heap memory.

I'm now convinced there are no shortcuts. Gotta build a real compiler in machine code.

@haitch @vertigo @freakazoid


There's only one problem: I don't know how to build a compiler. Not really. And definitely not in machine code. So I'm going to be fumbling around for a bit. Lots more wrong turns in my future.

I've been trying to port compilers.iecc.com/crenshaw to SubX. With tests. It's been slow going, because I have to think about how to make Crenshaw's primitives like `Error()` and `Abort()` testable.

@kragen@nerdculture.de @haitch @vertigo @freakazoid

I don't know if just learning to build a compiler will sustain my motivation, though. So some other ideas:

a) Some sort of idealized register allocator in Python or something. I've never built one, and my intuitions on how hard it is seem off.

b) Port Mu's fake screen/keyboard to SubX so that I can reimplement github.com/akkartik/mu/tree/ma. It was just too sluggish since Mu was interpreted. Even my 12-year-old students quickly dropped it in favor of Vim.

@kragen@nerdculture.de @haitch @vertigo @freakazoid


Register allocators are pretty fun - some heuristics applied to graph coloring, which is NP-hard in general.
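The idea can be sketched in a few lines of Python (a toy, not a production allocator): virtual registers are nodes, an edge means two values are live at the same time, and a "color" is a physical register. Register names and the degree-ordering heuristic here are illustrative assumptions:

```python
# Toy greedy graph-coloring register allocator.
# interference: dict mapping each virtual register to the set of
# virtual registers it conflicts with (is simultaneously live with).

def allocate(interference, registers):
    assignment = {}
    # Color nodes in decreasing order of degree, a common heuristic;
    # real allocators use smarter orderings and spill heuristics.
    for node in sorted(interference, key=lambda n: -len(interference[n])):
        taken = {assignment[n] for n in interference[node] if n in assignment}
        free = [r for r in registers if r not in taken]
        assignment[node] = free[0] if free else 'spill'
    return assignment

# Example: a/b interfere and b/c interfere, but a and c never overlap,
# so a and c can share one register.
graph = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
print(allocate(graph, ['eax', 'ecx']))
```

With two registers available, `b` gets one to itself and `a` and `c` share the other; shrink the register list to one and something gets marked `'spill'`.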

@haitch @vertigo @freakazoid @akkartik

@akkartik @kragen@nerdculture.de @haitch @freakazoid Honestly, my advice here is to read two books: "Programming a Problem Oriented Language" teaches you how to write your own Forth compiler from scratch (bare metal on up). There are no assembly listings in the book, because it's pure guidance, but it was instrumental in me getting DX-Forth working at all.

@akkartik @kragen@nerdculture.de @haitch @freakazoid

Second, if you're willing to tackle the more sophisticated compiler techniques needed to have a proper infix compiler, I would then recommend reading Compiler Construction, by Niklaus Wirth (see ethoberon.ethz.ch/WirthPubl/CB).

You don't need fancy register allocation to get a working compiler that meets 80% of your performance goals. Don't let that slow you down.

@akkartik @kragen@nerdculture.de @haitch @freakazoid
Next, I would strongly recommend reading cs.indiana.edu/~dyb/pubs/nano- . This was instrumental in getting my BSPL compiler prototype to work, and I'll be revisiting this again when I revise BSPL once more to port VertigOS to the .

@akkartik @kragen@nerdculture.de @haitch @freakazoid There seems to be this great fear of passes in compiler design. It's a one-pass compiler, or a two-pass compiler. Hogwash; do 56 passes if you must; all that matters is the final output. Register allocation can be one (or maybe a few) of those passes towards the end.

@vertigo That being said, if performance is a primary goal, I'd suggest considering whether simpler heuristics are applicable.

@akkartik @kragen@nerdculture.de @freakazoid

@haitch @akkartik @kragen@nerdculture.de @freakazoid Nice thing about not shying away from passes in compiler implementation is that passes are often plug-and-play relative to each other. That allows you to follow Butler Lampson's philosophy: "Make it work, make it right, make it fast."

A lot of what goes into making a compiler produce the fastest possible code is an act of diminishing returns, and frequently has to be redone every processor generation. Ergo, the more you can modularize it, the better.
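The plug-and-play idea can be made concrete with a small sketch (in Python, with made-up pass names): each pass takes the whole program representation and returns a new one, so passes can be added, removed, or reordered without touching their neighbors.

```python
# A toy compiler for RPN expressions, structured as a pipeline of passes.

def parse(src):
    # Trivial "parser": split an RPN expression into tokens.
    return src.split()

def fold_constants(tokens):
    # One optional pass: collapse "<n> <m> +" into the literal sum.
    out = []
    for t in tokens:
        if t == '+' and len(out) >= 2 and out[-1].isdigit() and out[-2].isdigit():
            b, a = out.pop(), out.pop()
            out.append(str(int(a) + int(b)))
        else:
            out.append(t)
    return out

def emit(tokens):
    # Code generation for an imaginary stack machine.
    return '\n'.join(
        'push ' + t if t.isdigit() else
        'add' if t == '+' else
        'load ' + t
        for t in tokens)

PASSES = [parse, fold_constants, emit]   # reorder or drop passes freely

def compile_(src):
    result = src
    for p in PASSES:
        result = p(result)
    return result

print(compile_('1 2 + x +'))
```

Deleting `fold_constants` from `PASSES` still yields a correct (just less optimized) program, which is exactly the "make it work, then make it fast" property.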

@vertigo Anyway, routines that juggle more than 32 variables at any given time are possibly not very well thought out. And routines that use more than that should probably be penalised and split into other functions. This is less of a problem for processors with many registers, but I see how it can be a concern for @akkartik who's specifically targeting IA32.
@akkartik @kragen@nerdculture.de @freakazoid
