Factorial on SubX

akkartik.name/images/20180730-

Ok, I think I understand calling conventions now.

Also coming face to face with the pain of debugging machine code 😀

github.com/akkartik/mu/commit/

Kartik Agaram

Now that it can translate labels to offsets, SubX also warns on explicit use of error-prone raw offsets. Both when running and in Vim.

As I build up the ladder of abstractions I want to pull up the ladder behind me:

a) Unsafe programs will always work.
b) But unsafe programs will always emit warnings.

As long as SubX programs are always distributed in source form, it will be easy to check for unsafe code. Coming soon: type- and bounds-checking.

github.com/akkartik/mu/tree/ma

@freakazoid @h @rain1

SubX now supports basic file operation syscalls: akkartik.github.io/mu/html/sub

I've also made labels a little safer, so you can't call to inside a function, or jump to within a different function: akkartik.github.io/mu/html/sub

Next stop: socket syscalls! @h

github.com/akkartik/mu/tree/ma

@freakazoid @rain1 @vertigo

@h Hmm, the socket syscalls are implemented differently on different platforms. That's dismaying. I'd been hoping to use a set of primitives so tiny that programs for it would work on all modern, extant *nixes. Now I probably need to start testing on Linux (top priority) and all the different *BSDs including Darwin (which is what I develop on).

:(

@freakazoid @rain1 @vertigo

@akkartik @h @freakazoid @rain1 @vertigo So BSD sockets are implemented in different ways on different platforms?

Makes sense enough, I suppose, given epoll vs kqueue vs IOCP

@yumaikas I'm not sure. Darwin certainly does them differently from Linux.

Linux has a single `socketcall()` syscall (number 102) that multiplexes standard Posix functions like `socket()`, `connect()`, etc. man7.org/linux/man-pages/man2/; syscalls.kernelgrok.com.

Darwin has separate syscalls for each Posix function. `socket()` is 97, `connect()` is 98, and so on. opensource.apple.com/source/xn

Hopefully the BSDs all share a common approach. Bears investigation.

@h @freakazoid @rain1 @vertigo
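A minimal sketch of the kind of per-platform shim this difference would call for. The `encode` helper and the table names are hypothetical, not anything in SubX. The Linux sub-call codes (1 for `socket()`, 3 for `connect()`) are the ones from `<linux/net.h>` and apply to 32-bit x86, where `socketcall()` takes a sub-call code plus a pointer to an argument array; the Darwin numbers are the ones quoted above.

```python
# Hypothetical shim table; names are illustrative, not part of SubX.

# Linux (32-bit x86): a single multiplexing syscall, socketcall(), number 102.
LINUX_SOCKETCALL = 102
# Sub-call codes from <linux/net.h>.
LINUX_SUBCALL = {"socket": 1, "connect": 3}

# Darwin: one syscall number per Posix function.
DARWIN_SYSCALL = {"socket": 97, "connect": 98}


def encode(platform, op, args):
    """Return (syscall_number, syscall_args) for a socket operation."""
    if platform == "linux":
        # The kernel receives (sub-call code, pointer to packed args);
        # the tuple stands in for the packed argument array.
        return LINUX_SOCKETCALL, (LINUX_SUBCALL[op], tuple(args))
    if platform == "darwin":
        # Arguments go straight to the dedicated syscall.
        return DARWIN_SYSCALL[op], tuple(args)
    raise ValueError(f"unsupported platform: {platform}")
```

The same logical call, say `encode(p, "socket", [2, 1, 0])`, ends up with a different syscall number and argument shape on each platform, which is the part a shim would have to hide.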

@akkartik @h @freakazoid @rain1 @vertigo What (if anything) would keep you from linking to a library and calling that instead via SubX, since Posix is a thing? Or does SubX not have a C FFI set up?

@yumaikas Yes, I'll probably have to swap in a different library for each platform.

> what would keep you from linking to a library?

Sheer bull-headedness :) SubX requires nothing more than a kernel to run. Not even a linker. I'd like to preserve this property when it starts self-hosting.

> does SubX not have a C FFI set up?

Interop with C or Posix is an anti-goal. The whole goal is to eliminate fixed interfaces, not provide yet another ossified interface.

@h @freakazoid @rain1 @vertigo

@akkartik It doesn't seem too difficult to provide a Linux-like stub to uniformise socketcall() across different systems. Even Windows, which originally was just stolen from BSD wholesale.

@vertigo @rain1 @freakazoid @yumaikas

@yumaikas @freakazoid @rain1 @vertigo @akkartik Er... Windows sockets were originally just BSD sockets.

@h Yeah, I've been expecting the need for libraries. That seems pretty timeless.

What I'm going through the five stages of grief about is the need to test on multiple platforms.

@yumaikas @freakazoid @rain1 @vertigo

@akkartik I don't think it will be too painful. Maybe you can get by with simple includes for now: no namespaces, no automatic choice of implementation for a given target arch, nothing of the sort. Plain includes seem relatively easier to implement.

@vertigo @rain1 @freakazoid @yumaikas

@yumaikas @freakazoid @rain1 @vertigo @akkartik
The only annoying task to implement plain includes is perhaps checking for recursive inclusions.
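A rough sketch of that check, assuming a hypothetical `include NAME` directive and an in-memory dict of sources in place of real files. The idea is to carry the chain of files currently being expanded and refuse to re-enter any of them.

```python
def expand(name, sources, stack=()):
    """Return the lines of `name` with includes expanded.

    `sources` maps a file name to its text. `stack` is the chain of files
    currently being expanded; meeting one of them again is a recursive include.
    """
    if name in stack:
        chain = " -> ".join(stack + (name,))
        raise ValueError(f"recursive include: {chain}")
    out = []
    for line in sources[name].splitlines():
        if line.startswith("include "):
            # Hypothetical directive syntax: `include NAME`
            target = line[len("include "):].strip()
            out.extend(expand(target, sources, stack + (name,)))
        else:
            out.append(line)
    return out
```

Because only the active chain is tracked, a file included twice from unrelated places still expands both times; only genuine cycles are rejected.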

@akkartik But that, too, is unnecessary for a proof-of-concept generating stubs. You may only need #ifdef and #include equivalents. Or perhaps something like Fo's assembly modules, each in a separate file labeled for each target arch. If you do it that way you don't need the annoyance of implementing #ifdef blocks. Just includes and nothing more.

@vertigo @rain1 @freakazoid @yumaikas

@h Mu has a tradition of loading all reasonable-looking files in its directory in a well-defined order. I may do that rather than includes.

Loading code from multiple files is not a big deal, I was planning on it anyway.

No, my pain is more about needing to spin up a VM or VPS with different OSs, and develop/test/debug on them.

You're right that it's a one-time thing when building the shims/polyfills. Just needs doing. I may start on self-hosting first.

@vertigo @rain1 @freakazoid @yumaikas

@akkartik Yeah, it doesn't even have to be a socketcall() port. It could be a BSD port, since it's really Linux that went a different way.

@yumaikas @freakazoid @rain1 @vertigo

@vertigo @rain1 @freakazoid @yumaikas @akkartik

All the BSDs, Darwin, and Windows follow the BSD sockets API, so it may make more sense to go the other way around and only provide appropriate shims for Linux.

@h @akkartik @yumaikas @freakazoid @rain1 Windows uses its DLL interface to invoke sockets calls; it does not rely on x86-style INT traps like Linux or its BSD equivalent. To work with Windows, you'll need to support dynamic loading and linking.

@vertigo Yeah, Windows is out of scope for the moment. I'm sure there are many more bugs hiding under that particular rock.

@h My sense is that the "BSD sockets API" is just at the level of C prototypes, just like Posix. Do you happen to have any pointers to how the API is implemented in syscalls in different BSDs?

@yumaikas @freakazoid @rain1

@akkartik I think the most straightforward guide you can even borrow code from directly is FASM. That will probably save you an awful lot of time.
flatassembler.net

Even if Windows is out of the scope, flatassembler also includes support for win32 calls and generation of PE32 binaries, so that's not a limitation (although @vertigo is right about Windows being a massive dynamic linking annoyance, with no officially published kernel apis)

@rain1 @freakazoid @yumaikas @vertigo

@yumaikas @freakazoid @rain1 @vertigo @akkartik

Apologies, I was going to say something more, but my instance is experiencing difficulties, I'm going to be using this backup account temporarily today.

@h @haitch Thanks for the reminder about FASM. I may just drop SubX and switch to it.

I've been avoiding self-hosted languages/platforms; the circular dependency of semantics on previous versions bars my goal of thin, easy-to-traverse abstractions. But for an assembler it doesn't feel like as big a deal because of the 1-to-1 mapping between source and binary.

Unfortunately, FASM doesn't seem to have any socket support :( On any platform. So no shims here.

@yumaikas @freakazoid @rain1 @vertigo

@akkartik Oh take a look at the programming samples. I think there's one sockets example, although that may be OS-specific, you can still take a look at the binary it builds.

@h @yumaikas @freakazoid @rain1 @vertigo

@haitch @h Yes, of course, it's Assembly so one can do anything with it. But the codebase doesn't in itself *encode* how to make socket calls. On any platform, let alone multiple ones. `grep` returns 0 results.

Searching the net shows me plenty of examples -- and always there's the question of how to make it cross-platform.

So I could use FASM, but it seems independent of the need to figure out how sockets are implemented on different platforms.

@yumaikas @freakazoid @rain1 @vertigo

@akkartik Why would you drop SubX? I was under the impression that your goals are different than flatassembler's.

@h @yumaikas @freakazoid @rain1 @vertigo

@h @haitch

My broad goal is a platform that I can quickly drill into as deep as necessary. Without knowing the entire stack; JIT learning of the minimum necessary for my immediate purposes.

SubX is just a means to a sub-goal: conventional compilers are too 'thick' to permit easy hackery. Particularly simultaneously hacking within as well as atop them. The context switch is a vast chasm right now. A stack based on FASM would be a vast improvement on that.

@vertigo @yumaikas @freakazoid @rain1

@akkartik We probably discussed this weeks ago, but just a reminder that flatassembler 1.0 is an entirely different assembler from fasmg. They have a nomenclature problem there, but it's important to be aware of the differences. The main ones are that fasmg has a new advanced macro system and can output Mach-O binaries.

@haitch @h Ah, good point. I had it downloaded but forgot to look inside it.

With regard to syscalls the situation is the same:

```
$ grep '^[a-z]' linux/system.inc
system_init:
system_shutdown:
malloc:
malloc_fixed:
malloc_growable:
realloc:
mfree:
open:
create:
write:
read:
close:
lseek:
get_timestamp:
display_string:
display_error_string:
get_environment_variable:
```

Pretty much the same on every platform across both projects.

I'll still be focusing on fasmg rather than FASM 1. Thanks!

@akkartik @h @haitch @yumaikas @freakazoid @rain1 I can honestly say that the project would not still be going ahead like it is now had I not read Chuck Moore's Programming a Problem-Oriented Language.

I can confirm that Forth is a language that, in the general case and if you keep things simple, can be bootstrapped from raw assembly.

My Kestrel-3/E2 port of DX-Forth is less than 10KB of code too.

(It'll get larger when I implement limited 9P support for it though.)

@vertigo Forth is definitely bootstrappable. But the extreme lack of checking makes it very hard to run with somebody else's code. Specifically the lack of good error messages when I pass the wrong number or type of arguments to a function.

It took me months to reluctantly move on and consider alternatives: lobste.rs/s/0myzye/thoughts_on

We've chatted briefly about this before, though perhaps I wasn't clear then: mastodon.social/@akkartik/1003

@h @haitch @yumaikas @freakazoid @rain1

@akkartik @vertigo @h @haitch @freakazoid @rain1 I take it that writing stack effect checker was probably going to be out of scope for this project?

@yumaikas @akkartik @h @haitch @freakazoid @rain1 To be effective for compile-time checking, you'd need to embed stack depth checking in the compiler itself and arity information within the dictionary. Doable, but inconvenient.

Most Forth systems actually have stack checking at interpret-time; however, by the time it detects a depth imbalance, it's often too late.
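A small sketch of the compile-time approach described above, assuming a toy dictionary that stores arity as `(consumed, produced)` per word. The word set and the `define` helper are hypothetical, and real Forth words with variable stack effects or control flow would need more than this.

```python
# Hypothetical dictionary: word -> (items consumed, items produced)
EFFECTS = {"dup": (1, 2), "drop": (1, 0), "swap": (2, 2), "+": (2, 1), "*": (2, 1)}


def stack_effect(words, effects=EFFECTS):
    """Return (inputs needed, outputs left) for a straight-line word sequence."""
    depth = 0   # stack depth relative to entry
    needed = 0  # how far below the entry depth the sequence reaches
    for w in words:
        if w.lstrip("-").isdigit():
            consumed, produced = 0, 1  # an integer literal pushes one item
        elif w in effects:
            consumed, produced = effects[w]
        else:
            raise KeyError(f"unknown word: {w}")
        depth -= consumed
        needed = max(needed, -depth)
        depth += produced
    return needed, needed + depth


def define(name, words, declared, effects=EFFECTS):
    """Check a definition against its declared effect before recording it."""
    actual = stack_effect(words, effects)
    if actual != declared:
        raise ValueError(f"{name}: declared {declared}, computed {actual}")
    effects[name] = actual
```

For instance, `define("square", ["dup", "*"], (1, 1))` is accepted, while declaring it as `(2, 1)` is rejected at definition time rather than when the imbalance finally shows up at the interpreter.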

@akkartik @h @haitch @yumaikas @freakazoid @rain1 No, you were crystal clear. I just don't necessarily agree 100% with your views or findings. (But I do respect them.)

@vertigo To back up to your earlier comment, I'm *very* amenable to being persuaded. It's a loosely held opinion; I know nowhere near enough Forth to be sure.

I'd love to hear elaboration or about examples where Forth programmers are able to collaborate effectively across space and time. It's quite possible it's just a skill I can learn.

@h @haitch @yumaikas @freakazoid @rain1

@vertigo
Where can I read about the Kestrel-3 project?

@rain1 I have resources scattered around the web; it's all very disorganized, I'm afraid.

This website is old and out-dated, but provides motivations: kestrelcomputer.github.io/kest

This website is the currently active repository (I no longer rely on Github for development): chiselapp.com/user/kc5tja/repo

If these sites do not answer any questions you have, I'm happy to answer them here and update one of the sites accordingly.

@rain1 Oh, I keep forgetting to update it, but I also have a project page at hackaday.io: hackaday.io/project/10035-kest

(I have one for the Kestrel-2DX as well, but that's a project which is no longer being developed.)

@akkartik @h @haitch @yumaikas @freakazoid @vertigo

A self-hosted system isn't defined by its source code alone. It's defined by a pair (source, binary), and the binary is basically impenetrable. So I agree that it can be difficult to understand when essential knowledge about how the system works is opaque.

There are cases where it can be OK: Like transpiling a higher level language to a high level one. In other words when you only add some syntax, not extra semantics.

@akkartik @h @haitch @yumaikas @freakazoid @vertigo

But there have to be self-expressed systems like this; we won't be able to remove them completely. I am very interested in the idea of SubX for this reason. Given (source, subx binary) it will be possible to study both.

Maybe the two can be viewed together in some way, like a literate program.

@rain1 Yeah.

There's a dependency on the human cortex here. We look at code to understand it, and we find text easier to look at than binary. Since computers execute binaries, there has to be some recurrence at bottom. But it can be minimal. It's easy, for example, to use a third-party tool like `xxd` to check that a list of hex codes is identical to the contents of a binary. Now you have text, and you can build up from there.

@h @haitch @yumaikas @freakazoid @vertigo
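The check itself is tiny. A sketch in Python rather than `xxd`, assuming the text is plain whitespace-separated hex bytes with no comments or labels (real SubX source carries more than that); the function name is hypothetical.

```python
def hex_matches_binary(hex_text, binary):
    """True if the whitespace-separated hex byte codes equal the binary's bytes."""
    return bytes.fromhex("".join(hex_text.split())) == binary


# Illustrative bytes only: 32-bit x86 for `mov ebx, 42; mov eax, 1; int 0x80`.
listing = """
bb 2a 00 00 00
b8 01 00 00 00
cd 80
"""
image = bytes([0xBB, 0x2A, 0, 0, 0, 0xB8, 0x01, 0, 0, 0, 0xCD, 0x80])
print(hex_matches_binary(listing, image))
```

In practice `image` would come from reading the binary file; the point is only that the trusted base for this step is a comparison of bytes.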

@h @akkartik @freakazoid @rain1 @vertigo Can docker/lxc run OSX/BSD and Windows on a linux machine? You're still taking time to spin up and configure 3-6 VMs when you start doing more serious testing, or having friends test on their machines
