I’ve been trying to study the runtime system of OCaml by digging through the codebase, so please bear with my limited understanding.
I came across this question because I saw `gc_regs` and its use in `roots_nat.c`. It seemed a bit redundant, since each “OCaml frame chunk” only has a few `gc_regs` roots, but we are checking them when scanning each frame (distinguishing stack roots from register roots based on the lowest bit of each entry in `frame_descr::live_ofs`). So I looked into this further and think I understand why we need it, but on second thought it still feels somewhat unnecessary.
For instance, instead of

    ...
    call caml_alloc1
    ...

we could inline the allocation check:

    ...
    subq $16, %r15
    cmpq %r15, Caml_state(young_limit)
    jbe cont
    ... # push live regs onto stack, in contrast with aggressive spilling in `caml_call_gc`
    mov caml_garbage_collection, %rax
    call caml_c_call
    movq Caml_state(young_ptr), %r15
    cont:
    ...

This way, live regs are always saved onto the stack, and we can analyze where they are, since we are doing a normal function call to `caml_garbage_collection`, whereas the current approach seems to:
- assume no reg is touched when first calling
- aggressively save all regs when we have to call
- since we actually have live regs (instead of live stack slots) before calling `caml_alloc*`, use that information to pick the live roots out of all the aggressively spilled regs.
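
To check that I'm reading the per-frame scan correctly, here is a minimal C sketch of how I understand the stack-vs-register distinction. It is a simplified model, not the runtime's code: `toy_frame_descr`, `toy_root_addr`, and `toy_scan_frame` are names I made up for illustration, and I'm assuming that an odd `live_ofs` entry is a register index shifted left by one, while an even entry is a byte offset from the frame's stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

typedef intptr_t value;

/* Hypothetical, simplified stand-in for the runtime's frame descriptor:
   only the fields needed to locate live roots are modeled. */
typedef struct {
  uint16_t num_live;     /* number of live roots at this return address */
  uint16_t live_ofs[4];  /* encoded root locations (see toy_root_addr) */
} toy_frame_descr;

/* Resolve one encoded entry to the address of the root it names.
   Assumption: odd entries index the saved-register area (gc_regs),
   even entries are byte offsets from the frame's stack pointer. */
static value *toy_root_addr(uint16_t ofs, char *sp, value *regs)
{
  if (ofs & 1)
    return regs + (ofs >> 1);
  return (value *)(sp + ofs);
}

/* Collect the addresses of all live roots of one frame into out[].
   Returns how many roots were found. */
static size_t toy_scan_frame(const toy_frame_descr *d, char *sp,
                             value *regs, value **out)
{
  size_t n = 0;
  for (uint16_t i = 0; i < d->num_live; i++)
    out[n++] = toy_root_addr(d->live_ofs[i], sp, regs);
  return n;
}
```

In this model every register is spilled into `regs` regardless of liveness, and the descriptor is what selects the few slots that really are roots. That selection step is what would disappear if live regs were always pushed to known stack slots.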
To me, inlining seems to be faster, and it allows us to get rid of `gc_regs` and the bit hack on
One drawback I see with this inlining is code bloat, but is that actually the reason why it’s not implemented?