Hi everyone!
This is my first post on this forum, and I’m pleased to present Gendarme, a generic-but-opinionated library to marshal and unmarshal OCaml data types in a variety of formats.
Why a new library?
OCaml has a few libraries in the ppx_deriving family, such as the well-known ppx_deriving_yojson, that make it very convenient to generate marshallers and unmarshallers for OCaml types. However, two aspects of this approach didn’t suit me:
- These libraries pollute the namespace quite a bit when we start combining them (e.g. when developing user-facing apps that allow ingesting several serialization formats);
- Adding support for a new format is hard and requires some PPX expertise most OCaml users don’t have.
How the project was born
When discovering the Go language, I was pleasantly surprised by how easy it was to marshal and unmarshal structs with simple annotations, and wanted a similar hassle-free mechanism in OCaml. I also wanted to learn about GADTs, as I had never found any use for them in my projects before.
This project was originally named Marshal, but a module of the same name already exists in the standard library. “Gendarme” is one way to translate “Marshal” into French.
What’s particular about Gendarme?
Gendarme is a modular, extendable, PPX-heavy marshaller and unmarshaller based on type witnesses, supporting various data formats (CSV, JSON, TOML, YAML). The curious reader can find much more information in the project’s repository, but here’s the gist:
[%%marshal.load Yojson]
type t = { t_foo: int list [@json "foo"];
           t_bar: t list [@json "bar"] } [@@marshal]
type u = t * int [@@marshal]
let v = ({ t_foo = [1; 2]; t_bar = [{ t_foo = [3; 4]; t_bar = [] }] }, 3)
let json = [%encode.Json] ~v u
(*
val json : string = "[{\"foo\":[1,2],\"bar\":[{\"foo\":[3,4],\"bar\":[]}]},3]"
*)
Annotating a type my_type with [@@marshal] and providing the required additional data (such as field names in the case of records) builds a witness value my_type of type my_type Gendarme.ty that tells Gendarme how to marshal and unmarshal values.
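To make the idea of a type witness more concrete, here is a minimal, self-contained sketch of the general technique. It is not Gendarme's actual code: the ty type, its constructors (TInt, TString, TList, TPair) and the encode function are all hypothetical names chosen for illustration, and the string escaping relies on OCaml's %S format, which is only an approximation of JSON escaping.

```ocaml
(* Hypothetical sketch of a GADT-based type witness.
   These definitions are assumptions for illustration,
   NOT Gendarme's real types or API. *)
type _ ty =
  | TInt : int ty
  | TString : string ty
  | TList : 'a ty -> 'a list ty
  | TPair : 'a ty * 'b ty -> ('a * 'b) ty

(* The witness drives the encoding: each case knows the concrete
   type of [v], so no runtime type information is needed. *)
let rec encode : type a. a ty -> a -> string = fun ty v ->
  match ty with
  | TInt -> string_of_int v
  | TString -> Printf.sprintf "%S" v (* OCaml-style escaping, close to JSON *)
  | TList t -> "[" ^ String.concat "," (List.map (encode t) v) ^ "]"
  | TPair (a, b) ->
    let (x, y) = v in
    "[" ^ encode a x ^ "," ^ encode b y ^ "]"

(* A witness for [int list * int], built by hand here; in Gendarme
   the [@@marshal] annotation generates the witness for you. *)
let pair_witness = TPair (TList TInt, TInt)
let encoded = encode pair_witness ([1; 2], 3)
```

The point of the design is that a single witness value can be reused by any number of encoders, which is why supporting several formats does not multiply the generated functions.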
Currently, only a subset of OCaml core types is supported, but generic support improves as the need arises in my personal and professional projects.
Gendarme was written both with users (application and library developers) and developers (people developing new Gendarme encoders) in mind. If your target format can encode objects/records (optional but nice to have) and lists/arrays, and supports nesting them arbitrarily, then writing a new encoder requires writing at most 100 lines of what is essentially pattern-matching cases to tell Gendarme what to do with your data (see for example the code for gendarme-ezjsonm). Each encoder is heavily tested, and most of the encoders that we ship have more lines of code for tests than for their actual logic.
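As a rough illustration of why an encoder can boil down to a handful of pattern-matching cases, here is a sketch of a backend for a made-up target format. Everything in it is an assumption for the sake of the example: the tree type stands in for a real format's document type, and ty, marshal and unmarshal are hypothetical names rather than Gendarme's encoder interface.

```ocaml
(* Hypothetical target format: a minimal tree with numbers and arrays. *)
type tree = Num of int | Arr of tree list

(* Hypothetical witness type, NOT Gendarme's real definition. *)
type _ ty =
  | TInt : int ty
  | TList : 'a ty -> 'a list ty
  | TPair : 'a ty * 'b ty -> ('a * 'b) ty

(* Encoding: one case per witness constructor. *)
let rec marshal : type a. a ty -> a -> tree = fun ty v ->
  match ty with
  | TInt -> Num v
  | TList t -> Arr (List.map (marshal t) v)
  | TPair (a, b) ->
    let (x, y) = v in
    Arr [marshal a x; marshal b y]

(* Decoding: match the witness against the shape of the input and
   fail when they disagree. *)
let rec unmarshal : type a. a ty -> tree -> a = fun ty tree ->
  match ty, tree with
  | TInt, Num n -> n
  | TList t, Arr items -> List.map (unmarshal t) items
  | TPair (a, b), Arr [x; y] -> (unmarshal a x, unmarshal b y)
  | _ -> failwith "unmarshal: shape mismatch"

(* Round trip through the hypothetical format. *)
let witness = TPair (TList TInt, TInt)
let value = ([1; 2], 3)
let round_tripped = unmarshal witness (marshal witness value)
```

A real encoder would also have to handle records, options and error reporting, so this only shows the overall shape of the work.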
This is a very quick introduction, but if you are interested in this project, head over to its README to learn more! I’m obviously happy to answer any questions you may have.