Has anyone here written an OCaml bytecode interpreter from scratch ?
Only the bytecode interpreter part. Not the parser / typechecker / compiler. No optimizations. Just a dumb bytecode interpreter.
I feel like many questions I have regarding (1) transpiling, (2) OCaml runtime, (3) OCaml ↔ other languages would be auto resolved if I had a correct mental model of the OCaml bytecode interpreter.
Has anyone written one from scratch? Any advice / insights ?
Thanks!
PS: I’m okay with using a dumb mark & sweep, global-stop-the-world GC. The main goal is educational, not production use.
I’m also curious about that, because the actual OCaml interpreter has an
annoying limitation: it’s not reentrant, which means you can’t really use it as
an embedded interpreter (and to use the OCaml toplevel you also need
your program to itself be compiled to bytecode).
An alternative, no-global-state, simple to embed interpreter could be
useful for programs extensible with/scriptable with OCaml, imho.
OCaml’s bytecode is a simple 1-register stack machine, so most of its instructions are trivial to implement. But here are a few things to look for:
There are macro-instructions. For example, PUSHACC is actually the sequence of the instructions PUSH and ACC, i.e., it pushes the accumulator register on the stack and then loads it from a random position on the stack.
The caller of a function might push more arguments on the stack than what the called function expects. The interpreter needs to keep track of their number. Indeed, if there is any extra argument, the RETURN instruction does not give control back to the caller. Instead, it assumes that the current accumulator is a closure and gives it immediate control.
The GRAB instruction checks that the caller passed enough arguments onto the stack. If not, it creates a closure, stores all the available arguments in it, points the closure at the instruction just before GRAB (which is assumed to be a RESTART instruction). Then, it returns to the caller, as if executing a RETURN instruction.
Some years ago I wrote a basic bytecode interpreter in Rust. My interpreter is very incomplete. I just made it run an example program I had.
I took as reference the interpreter found in obytelib. For garbage collection, I wrote a simple mark & sweep. Another important part was a deserializer of ocaml values. The bytecode includes serialized constant values.
My objective was to have a small interpreter that could be embedded and having multiple instances of it, just like Lua.
As I mentioned, I did not finish the project, but if you are still interested, I could share the code.
This is a pretty old thread but it’s the only one I found about RESTART/GRAB, so…
…I’ve just looked at the code for RESTART/GRAB in ocaml/runtime/interp.c on github, branch 14.4. There are some things about this mechanism which I cannot fully understand.
Given an environment ENV=env0 with size S and extra-arguments XARGS=xargs0,
RESTART pushes S-2 elements from env0[2] onward onto the stack, then sets ENV :=env0[1] and increments XARGS by S-2;
the following GRAB required_args checks whether XARGS >= required_args;
if so, decrements XARGS by required_args and exits;
otherwise, creates a closure in accu of XARGS+3 = xargs0+S+1 elements, sets accu[0] to address(RESTART), accu[1] to env0[1], and populates all other fields from accu[2] onward by popping all S-2 elements previously pushed by RESTART plus xargs0+1 more; finally it pops new values for PC, ENV and XARGS.
Wouldn’t it be more efficient to avoid S-2 pushes and S-2 pops by moving ENV :=env0[1] from RESTART to GRAB, which could set the fields accu[2]…accu[S+1] straight from env0 and pop only the remaining ones off the stack ?
And more: why two distinct instructions (which require two fetch/decode/exec cycles) instead of a single RESTART_GRAB? Is this a consequence of the bytecode compilation method?
I think you are missing one key point: when defining a function, a closure is created and the code pointer stored in the closure points to the GRAB instruction, not the RESTART instruction. When applying the function, if number of arguments >= function arity, the RESTART instruction is not executed. It is only in the case of partial applications (number of arguments < function arity) that a new intermediate closure is constructed with the code pointer pointing to the RESTART instruction immediately preceding the current GRAB instruction. When, later, more arguments are provided to this intermediate closure, the RESTART instruction is executed which pushes back onto the stack the arguments that had been provided originally (on top of the arguments that were provided more recently), and then execution flows to the GRAB instruction following the RESTART, which again tests if number of arguments >= function arity, if yes, executes function body, otherwise creates intermediate closure, etc.
Yes, I was thinking only at “interpreter level”, totally forgetting why RESTART and GRAB exist… But I’m thinking if it might be possible to have RESTART and GRAB n implemented as I wrote before and introduce a new instruction, let’s say START_GRAB n.
A function would be compiled as [GRAB n; BRANCHto rest of code; START_GRAB n; rest of code], its closure’s code pointer pointing to START_GRAB, which works like the current GRAB.
Or am I missing another key point?
Just to reduce pushes/pops operations… During Covid time I wrote an OCaml interpreter for the Commodore C64 (taking inspiration from a cute project by Benoît Vaugon, GitHub - bvaugon/ocapic: OCaml for PIC microcontrollers ) and its 6510 CPU (1Mhz clock) suffered a lot when executing many “indirect indexed address mode” instructions to operate on the stack
You are right that there is some useless memory churn, but there are two things to consider:
This situation only happens if an already partially applied closure is applied again in a non-fully way. This situation might be so rare, that it is not worth optimizing for it. You should first check if the overhead is noticeable.
You are not forced to consider RESTART and GRAB as two separate instructions. Your interpreter could handle RESTART as a 3-word instruction, which would then behave as your proposed START_GRAB instruction. The advantage is that this does not require any change to the compiler; in particular, no need for the extra jumps of your proposal.
Last night I modified my 5-year-old interpreter code: the first time a closure is applied, GRAB works as usual; but when RESTART is executed (intermediate closure application), it sets a flag to tell the next GRAB that the first S-2 arguments have to be read from ENV instead of the stack. It saves a few CPU cycles and seems to work well.