#BabelOfCode 2024
Week 2
Language: Forth
Confidence level: Low
PREV WEEK: https://mastodon.social/@mcc/113743302074837530
NEXT WEEK: https://mastodon.social/@mcc/113867584791780280
RULES: https://mastodon.social/@mcc/113676228091546556
So today's challenge looks *absurdly* easy, to the point I'm mostly just suspicious that part 2 will get hard. I figure this is an okay time to burn Forth.
I'm wanting to save Fortran for a week I can use the matrix ops. This puzzle looks suspiciously like part 2 will turn into a 2-dimensional array problem.
I *think* I'm doing this in pforth, for the simple reason that gforth, uh, isn't maintained anymore it seems, and so got dropped out of Debian Testing (which I have now)? I *think* I'd be *happier* using RetroForth, which is a "modern" Forth, but I guess it's better to learn the standardized, ANS Forth first. Even though everyone hates ANS Forth? Including the inventor of Forth…?
My biggest fear is there appears to be no way to read numbers written in ASCII from a file. We're predating ASCII
First problem I hit is comments don't work. The documentation specifically says text in parentheses is a comment, but it isn't accepted.
After some staring at the docs, I realize in all the examples, there are spaces. It turns out (Comment) is not a comment, but ( Comment ) is a comment. Because ( isn't a pure operator built in the language, rather there's a FORTH word ( that eats all words until ) is found. Holy crap. I never thought I'd say this but maybe it IS possible to self-host too hard
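A minimal sketch of the comment rule, assuming an ANS-style Forth such as pforth or gforth. `(` is an ordinary word, so it needs whitespace after it like any other word, and it then skips input up to the next `)`. The `\` line comment that comes up later in the thread is shown too. The word name `add-demo` is made up for illustration.

```forth
( This is a comment: the space after the opening paren is required )
\ This is a line comment; the backslash also needs a space after it

: add-demo ( -- n )   \ the usual stack-effect note is just an ordinary comment
  1 2 + ;

add-demo .   \ prints 3
```

The space before the closing `)` is only a convention; `)` is a delimiter the `(` word scans for, not a word in its own right.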
So this documentation is kinda very bad!
Lacunae I have noticed:
- They define a `KEY` operator for taking a character from STDIN, but don't explain what happens if `KEY` receives an EOF (experimentally: I seem to get a -1?)
- They explain a special syntax `CHAR n` for inserting the ASCII value of n directly into the code, but don't explain how the fuck you're supposed to represent the ASCII value for a non-character symbol such as a space or newline
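For the non-printing characters, one option (a sketch, assuming an ANS-style Forth): `CHAR` only helps for characters that can be typed as a word, the standard word `BL` pushes the code for a space, and anything else can be written as a plain number. The names `ch-newline`, `comma?` and `blank?` are invented here, not built-ins. Inside a colon definition the compile-time form `[CHAR]` is used instead of `CHAR`.

```forth
CHAR A .                 \ prints 65
BL .                     \ prints 32; BL is the standard word for the space character

10 CONSTANT ch-newline   \ hypothetical name: ASCII line feed written as a plain number

: comma? ( c -- flag )  [CHAR] , = ;    \ [CHAR] is the in-definition form of CHAR
: blank? ( c -- flag )  DUP BL = SWAP ch-newline = OR ;   \ true for space or line feed
```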
More pforth documentation horrors
- The pforth tutorial is not a tutorial for pforth but rather a general forth tutorial, and therefore hedges itself frequently. For example, notice this section where it explains that "many forths" have a CASE statement. "Many forths"? What about THIS forth I'm reading the documentation to RIGHT NOW?
- ABORT not documented. The documentation lists it as a reserved word but not what it does
Wrote a version 1.0 of my program, testing it. All I've got so far is the character parser, reading the numbers in and decoding ASCII and that's it.
This is… bad. I would describe this as bad behavior for a programming language interpreter
The forth interpreter isn't completely busted, my test.f screenshot above worked. A simple "echo ascii values" program I wrote ( BEGIN KEY DUP . CR 0< UNTIL ) worked. But my 32 line, mildly more sophisticated program just… signal 11s. I do not know how to proceed. I am using "Debian Testing", which is TECHNICALLY a beta OS, so maybe the pforth is broken *subtly*. gforth isn't in dpkg. I don't… I don't know what to do next if the compiler crashes.
Maybe I ssh into a VPS and run gforth there? :(
So here's my *current* code, which crashes in pforth:
I run it in gforth:
cat "data/sample-2.txt" | gforth src/puzzle.f
I get:
in file included from *OS command line*:-1
src/puzzle.f:13: Interpreting a compile-only word
>>>BEGIN<<< ( Line )
Backtrace:
$7F5C6D55EB30 throw
Line 13 is indeed the word "BEGIN". According to the tutorial, that is how you open a UNTIL loop.
fuck the in What?
Okay. So some updates.
It turns out loops (BEGIN..UNTIL) are a "premium" Forth feature and are only available inside functions. So I need to wrap the whole program in a : function ; . Well not the whole program, not the VARIABLEs, and I don't think you can nest the functions, and… never mind. I do the nest. New code:
https://github.com/mcclure/aoc2024/tree/85890d80d89ebe82df779e20607f78cd4275f8db
gforth fails with :
src/puzzle.f:38: Invalid memory address
I wrapped my program in a : run [code here] ; run . Line 38 is "run".
I'm still lost.
A thing worth noting here is if you read my posts carefully above, you'll find I successfully executed a BEGIN .. UNTIL program in pforth. So pforth just relaxes the requirements of gforth. I don't think I've hit what is causing pforth to crash yet. I'm just trying to satisfy gforth's requirements for running the software at all. Maybe I should have read gforth's manual instead of assuming pforth's is adequate :(
This raises an interesting problem. I could have used a "nice" Forth like RetroForth or Factor(?) but I wanted to learn ANS Forth before I moved on to specializations. However, now I realize there are *only* specializations. Pforth is apparently giving me all kinds of niceties, the premium DLC is included at the toplevel. And I know for a fact gforth (of course, because that stands for GNU Forth) contains GNU extensions. So there are two standard Forths in Linux, neither actually standard.
Okay. So this explains my segfault! I can now run in pforth.
https://xoxo.zone/@clarity/113783794022449133
I accidentally wrote on one line `partial TRUE !` ; correct would be `TRUE partial !`. All variable assignments in Forth are done with pointer dereferences, so the wrong order was like trying to write to memory address TRUE (-1). It is as if C++ allowed you to write "true = x;" by accident instead of "x = true;" and write the address of x to 0x1
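A small sketch of the store order, using the same variable name `partial` as above. The store word `!` has the stack effect ( x a-addr -- ), so the value goes on the stack first and the address second.

```forth
VARIABLE partial        \ executing partial pushes the address of its cell

TRUE partial !          \ correct order: value first, then address
partial @ .             \ prints -1

FALSE partial !
partial @ .             \ prints 0

\ The buggy order would be:   partial TRUE !
\ That asks ! to store the address of partial at address -1, hence the crash.
```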
I am unblocked, but still can't do jack shit in gforth
Sorry, I kinda disappeared in the middle of asking a question. Me and @spookysquid were doing something extremely normal
@mcc the same thing goes for strings, at least in a lot of forths I've used
@mcc Forth is from an age when even macro assemblers cost money (and didn't support your arch anyway). Having a medium-level REPL you can write in a few hundred instructions is nothing to sneeze at. Does make for some weirdness though... I can't wait for you to make it do matrix ops.
@mcc Same for " for strings. That and immediate words and compilation mode and words defined in assembly/machine code vs the threaded interpreter and you've basically got the language =)
@mcc I've never looked at pforth but some pretty much define nothing. It might also define \ as to end of line comment.
@mcc oh oh. If you’re still open to language ideas and you want to get a sense of what a forth-like language would feel like if it were “modern”: have you heard of https://factorcode.org ?
It’s as if Common Lisp, Smalltalk, Forth, and Haskell had an unholy union. It was developed by two of the Swift devs!
@mcc I mean they’re very different: Factor will definitely feel more familiar and probably easier to set up your environment for, and it’ll have overall better practical library support for common real world tasks. It also does a lot of “dataflow combinators” that do stack shuffling but at a higher level than swap/dup/drop/etc. Once you get past the initial “this is stack stuff”, things look and feel much more familiar.
Uiua seems like a much simpler language that’s more like APL? So the input issue remains, and it has a more limited standard library and doesn’t have all the IDE/doc stuff
@mega @mcc yeah if I sound enthusiastic it’s because Factor is one of those languages I really really wish I’d be able to justify regularly writing software in. But it’s very lightly maintained these days and never crossed certain thresholds that I feel are table stakes for modern languages (package management and either async or M:N microthreading). All the community libraries are literally bundled with the language itself heh.
And ofc, there’s less effort on perf work, by virtue of it just not having that much attention on it anymore.
But the foundation is SO good, and you can still get a lot of real stuff done with it.
@mcc like it has the whole dedicated dev environment with integrated docs and browsing and clicking around and stuff. Pretty amazing.
@mcc There you go again, you silly goose, pretending programming is an exact science.
@mcc "Many Forths" is my favorite Madeleine L'Engle book.
(Alternative joke: "Many Forths" is a pretty good newspaper comic)
@mcc the book to use for "I don't know enough Forth yet to understand this Forth" is Starting Forth https://www.forth.com/starting-forth/
The word to use for "I don't understand what this Forth system is doing with this word" is SEE, which disassembles the word into its components (and in more developed systems, goes all the way down to the assembly instructions). pForth does have SEE, fortunately.
@Triplefox I don't feel "ABORT SEE" is communicating anything to me.
@mcc try "SEE SEE"
@Triplefox @mcc Ah, yes, this is one of the places where the Forth/RPN languages always confuse you; while normally you have arguments pushing to the stack followed by a function that consumes them, special syntax like SEE is backwards from that, you write SEE first and then the thing it applies to (because you don't want to actually evaluate the argument, which is what would happen if it comes first).
So it would be SEE ABORT. It looks like it just prints out the code of the abort function, which isn't very helpful:
SEE ABORT
( -E0BE1E0 ) DEFER (ABORT) ;
ok
@unlambda @mcc it's actually more helpful than it looks if you keep breaking it down!
DEFER is one of the parser-altering words used in metaprogramming, and (ABORT) is a separate word containing the concrete effects of running ABORT. So to continue breaking it down you could either do SEE DEFER to study the parse, or SEE (ABORT) to see how it resets interpreter state.
@mcc idk, message me? :3
@mcc A segfault in a generally memory safe language would suggest a stack overflow to me. Any chance of that? Can you debug printing to see if anything is going on before the segfault?
@porglezomp @unlambda @mcc Nope. Not even slightly.
@darkling @porglezomp @unlambda I am not using the "peek/poke" commands. i have two VARIABLEs, that's it.
@mcc @porglezomp @unlambda A `VARIABLE` is literally just a word that puts an address onto the stack, so that you can use `!` and `@` on it...
@porglezomp @mcc Oh, maybe not. I generally kind of have the assumption that most languages other than C/C++ have bounds checks by default and no pointer operations without accessing some explicit unsafe/peek/poke kind of operations, but I guess that was a lot less common in the era that Forth was written.
@unlambda Yeah, Forth definitely gives you low level memory manipulation access
@mcc Forth isn't an interpreted language, though. It's compiled. (Granted, almost all of the compiled code is branch instructions to the definitions of words, but it's still compiled).
@mcc It's probably a stack underflow. That's where most of my segfaults come from.
I usually stick in a metric shitload of `.S` and see if the stack is actually what I think it should be, and where it might be underflowing.
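A sketch of that kind of `.S` sprinkling, assuming the Forth in use provides `.S` (pforth and gforth both do). `.S` prints the stack without consuming anything, though the exact display format differs between systems. The word `average` is a made-up example.

```forth
: average ( a b -- avg )  + 2 / ;

10 20 .S          \ shows the two numbers waiting on the stack
average .S        \ shows the single result, 15
DROP DEPTH .      \ prints 0 if the stack was empty before this snippet ran
```

If a `.S` shows fewer items than the next word's stack comment expects, that word is where the underflow will happen.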
@mcc This is a language developed in the 1970s. You can't waste precious clock cycles on bounds checking every time you pop something off the stack! What sort of profligate fool are you?!
@mcc the mode of the interpreter is still in the "interpreting" state (what you use at the repl) - if you place that code inside a word definition it will now be in compiler mode
@Triplefox Can word definitions be nested inside other word definitions?
@mcc let me put it this way.
What the modes are doing in a typical system is deciding whether the parser needs to consume something right away, or writing it to a buffer in the dictionary. There are actually three modes because you need a third one to do the fancy metaprogramming shenanigans that let you do what you're asking. Most of the bootstrapping of a Forth takes place through flipping between executing the word and compiling the word.
So, of course, it's possible, but not in the abstracted way that most PLs present. You have to understand the internals at some level, and once you do, you really have most of what's interesting about Forth-the-language.
@mcc control flow words like BEGIN, UNTIL, IF, ELSE, THEN, etc should only be used inside a word definition. they are "compile-only" words, can't be used in interpreter mode.
Some forths have ways to handle these, but it's not standard.
@mcc so, if you wrap the BEGIN loop in a word definition and then call it, you should be good to go.
: my-loop
BEGIN
...
UNTIL
;
my-loop
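A concrete version of that pattern, reusing the echo loop quoted earlier in the thread. This is only a sketch: the name `echo-codes` is invented, and treating a negative `KEY` result as end of input rests on the -1 seen experimentally in pforth, which other Forths may not match.

```forth
\ BEGIN and UNTIL are compiled into the definition instead of being interpreted
: echo-codes ( -- )
  BEGIN
    KEY DUP . CR     \ read one character and print its code on its own line
    0<               \ a negative result is treated as "end of input"
  UNTIL ;

\ echo-codes         \ uncomment to run; it reads from standard input
```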
@typeswitch @mcc It's wild that the failure mode for that is a segfault rather than an error message.
@xgranade @typeswitch it's not. updating
@xgranade @mcc in simpler forths these words will be executed as if they were in compile mode anyway. in general this will have some truly awful consequences ... like IF will not test anything, it will emit a conditional branch instruction and push an address on the stack, ELSE and THEN will take an address off the stack and write to that address, etc. so a segfault is the best you can hope for really.
@mcc FORTH is basically a macro assembler. or uh, macro code generator, really. tho there is an "interpreter" that recognizes words and immediately executes them.
if you call certain words, it corrupts the interpreter state. so they can only be called from compiled words.
or at least that's how it used to be done. we don't know about modern forths.
@mcc either way...
sorry to hear
@mcc Most of the useful stuff for actual programming (like conditionals, loops) can only be executed once compiled. That's usually done with : to create a new word with the compiled code in it.
(Also, that code's massively non-idiomatic, with a massive C accent; The Forth way is effectively about building a DSL with words that you can use to solve your problem: I've seen it described as "language-oriented programming".)
@mcc Oh, and line 16 probably isn't going to do what you wanted -- it's doing (((n != 9) != 13) != 20) && n && n.
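One way to write "n is none of these values", as a sketch. The actual line 16 isn't reproduced in the thread, so the intent is assumed, and `none-of?` is an invented name. Each comparison consumes its operands, so the value has to be copied with `DUP` or `OVER` before each test, and the flags combined with `AND`.

```forth
: none-of? ( n -- flag )
  DUP  9 <>          \ ( n f1 )
  OVER 13 <> AND     \ ( n f )  f is true when n is neither 9 nor 13
  SWAP 20 <> AND ;   \ ( f )    last use of n, so no copy is needed

65 none-of? .        \ prints -1 (true)
13 none-of? .        \ prints 0 (false)
```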
@mcc you're starting to see how forth is more of a "way of life" than a programming language. the community convention is that you write your own forth instead of buying one off the shelf, lol
@mcc FWIW, what I referred to when I really implemented a Forth for the first time was Forth-83. It's a smaller standard that does less, provides less and gets at the core of the language faster. When you introduce the things that make, e.g., gForth convenient for debugging, you start to have a fundamentally different kind of system because it has to abstract everything instead of failing in an implementation-legible way.
That's why Moore's Forths are divergent from the standards-driven Forths, which aim to be industry-friendly. Even so, when you look at the ANS and Forth2012 standards in depth, it's mostly a rough agreement on, "these words kinda do the same thing here".
@mcc@mastodon.social time to invent jorth (jean Forth)