To be clear, checkpointing means storing a program state between toplevel commands and recovering to the state when a user wants:
$ ocaml
# let x = 3;;
# x;;
(prints 3)
# #checkpoint "mychk1";;
# let x = 4;;
# x;;
(prints 4)
# #recover "mychk1"; (* recovers the value of x *)
# x;;
(prints 3)
I am aware that if it might be costly to checkpoint if there are lots of variables.
I already tried dmtcp and it fails on an ARM instance of AWS - I left my issue here: `make check` fails on an ARMv8 machine · Issue #1045 · dmtcp/dmtcp · GitHub …
That being said, I think it would be great to have such a tool that is maintained by the OCaml community. I am willing to make a contribution for this if it is not too complicated.
Saving and reloading the OCaml heap is probably not too hard. But a running OCaml program may also contain C memory and various runtime structures, and doing this right sounds very tricky and also something that can be solved by a generic OS-specific approach that is not in any way specific to OCaml. (Apparently the latest Linux checkpointing program on the block is CRIU, I would try this.)
[All of these recollections are “IIRC”] In the mid-80s I wrote code for GNU Emacs to properly unexec() (that’s what it was called in Emacs; it waas called undump in TeX) on NCR Tower machines (SysVR2 – jesus, what dreck! “shell layers” barf “streams modules” retch “no demand paging” hurl). This kind of code is very, very OS/arch-specific, and there’s very little that can make it be otherwise. It might be worth thinking about why you want this, and whether there are other ways of getting the same thing.
For instance, if what you’re looking for is the ability to unload the variable-bindings and module-loads of a particular toplevel environment, and then reload them later, you might want to look into actually just implementing that.
It’s not quite as good as a first-class environment support, but on a relevant note w.r.t to checkpointing, previously I’ve implemented a checkpointing-based proof-general-style interface on top of utop and GNU+Emacs.
In order to preserve the state, I use the same strategy employed by OCamldebug for time-travel debugging, and add additional commands to the utop initialisation to fork the process (suspending the parent) whenever a checkpoint is required:
The benefit of this is that it requires very little modification to the environment, and is implemented purely as an GNU+Emacs package - it simply runs an instance of utop and initialises the context with appropriate helpers to save and restore checkpoints.