Serialization of a type to binary/text format for caching

Hello everyone,

I am working on extending an existing OCaml project that is no longer maintained. I need to add caching for some of the complex types in that project - specifically, I want to save instances on the hard drive and load them during runtime instead of generating them each time.

For example, I have the following code:

let some_object = func in ... 

I want to be able to write some_object to disk and then load it again later.

Preferably, I don’t want to modify the existing source code of the types, as it would require significant changes (even if I just add tagging).

The best solution I found is the Marshal module, but from reading this forum I understand that it is neither the safest nor the most stable option. I intend to keep the cache indefinitely and still be able to read it later.

I explored atdgen, but it would require me to rewrite all the types in the atdgen language, which is not straightforward. I also considered ppx_deriving, but that would necessitate tagging the original source code, which is not an optimal solution for me.

I came across jaredly/milk on GitHub (Milk 🥛: stress-free serialization & deserialization for Reason/OCaml), but it is no longer maintained, and I was unable to make it work with OCaml 4.14.0.

So, I would appreciate any advice on how to implement caching for complex types in OCaml. Is Marshal truly an unstable solution, and will I encounter issues in the future? Do I need to manually write serialization and deserialization functions for my types?

You can define your own parallel types using type equations and add in [@@deriving sexp] or whatever serializer/deserializer you like to use.
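As a rough sketch of the type-equation idea, assuming a hypothetical Legacy module standing in for the code that cannot be modified: the type is re-declared with an equation, so values of both types are interchangeable, and a serializer can be attached to the new declaration. The serializer here is written by hand to keep the example dependency-free; with a ppx deriver installed, a [@@deriving ...] attribute on the re-declared type would generate it instead. All names below are made up for illustration.

```ocaml
(* Hypothetical stand-in for the library whose source must stay untouched. *)
module Legacy = struct
  type shape = Circle of float | Rect of float * float

  let unit_circle = Circle 1.0
end

(* Parallel type declared with a type equation. The constructors must be
   repeated exactly; the compiler checks that they match Legacy.shape.
   A [@@deriving ...] attribute could be attached to this declaration. *)
type shape = Legacy.shape = Circle of float | Rect of float * float

(* Hand-written text serializer for the re-declared type. *)
let shape_to_string : shape -> string = function
  | Circle r -> Printf.sprintf "circle %.17g" r
  | Rect (w, h) -> Printf.sprintf "rect %.17g %.17g" w h

(* Matching parser; returns None on malformed input. *)
let shape_of_string (s : string) : shape option =
  match String.split_on_char ' ' s with
  | [ "circle"; r ] -> Option.map (fun r -> Circle r) (float_of_string_opt r)
  | [ "rect"; w; h ] -> (
      match (float_of_string_opt w, float_of_string_opt h) with
      | Some w, Some h -> Some (Rect (w, h))
      | _ -> None)
  | _ -> None
```

Because of the equation, a value produced by the original code (such as Legacy.unit_circle) can be passed directly to shape_to_string, and the result of shape_of_string can be handed back to the original code without conversion.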

Look at CompilerPrinters.ml for a [@@deriving show] example. The deriver is added to the “unmodifiable” OCaml compiler source code at https://github.com/ocaml/ocaml/tree/4.14/typing. In my case I wanted MEnv.pp_lookup_error to work on the internal OCaml Env.lookup_error type. The ocamldoc inside CompilerPrinters.ml has a link to the underlying technique.

See Export/import data in OCaml - #10 by yawaramin and other messages in that thread. My suggestion, Umarshal, would not require changes to the existing types.

This limitation (ppx_deriving needing annotations on the original type definitions) can be overcome by using ppx_import. From their README:

type%import longident = Longident.t [@@deriving show]
let () =
  print_endline (show_longident (Longident.parse "Foo.Bar.baz"))
(* Longident.Ldot (Longident.Ldot (Longident.Lident ("Foo"), "Bar"), "baz") *)

As long as the cache is written and read with the same OCaml version, Marshal.to_string could be your friend.
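A minimal sketch of a Marshal-based cache that guards against version mismatches might look like the following. The helper names save_cache and load_cache and the file layout (a plain-text version line followed by the marshalled payload) are assumptions made for illustration, not an established convention. Note that Marshal is not type-safe: the caller has to annotate the expected type when loading, and marshalling fails on values containing functions unless the Marshal.Closures flag is given.

```ocaml
(* Write the compiler version as a plain text line, then the marshalled
   value. The version line lets the reader reject caches written by a
   different OCaml version before touching the binary payload. *)
let save_cache (path : string) (v : 'a) : unit =
  let oc = open_out_bin path in
  Fun.protect
    ~finally:(fun () -> close_out oc)
    (fun () ->
      output_string oc (Sys.ocaml_version ^ "\n");
      Marshal.to_channel oc v [])

(* Return None when the file is missing, empty, or written by another
   OCaml version. The result type is unchecked: the caller must annotate
   it with the type that was actually saved. *)
let load_cache (path : string) : 'a option =
  if not (Sys.file_exists path) then None
  else begin
    let ic = open_in_bin path in
    Fun.protect
      ~finally:(fun () -> close_in ic)
      (fun () ->
        match input_line ic with
        | version when version = Sys.ocaml_version ->
            Some (Marshal.from_channel ic)
        | _ -> None
        | exception End_of_file -> None)
  end

(* Usage: fall back to recomputing when the cache is absent or stale. *)
let () =
  let path = Filename.temp_file "cache" ".bin" in
  save_cache path [ (1, "one"); (2, "two") ];
  match (load_cache path : (int * string) list option) with
  | Some l -> List.iter (fun (n, s) -> Printf.printf "%d=%s\n" n s) l
  | None -> print_endline "cache miss, regenerate the value"
```

Treating a version mismatch as a cache miss means the cache does not survive a compiler upgrade, but it can always be regenerated, which sidesteps the long-term stability concern for data that is only a cache. This check does not detect changes to the type definitions themselves, so a cache written before such a change should be deleted or stored under a different file name.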