Version control of records

Hi guys. When I update a record type by adding one field of option type, deserialization of data written with the old version fails. Is there any tool that can directly fill in a None value, so that new-version data can be obtained from the original?
My current idea is something like the following code. However, when the record type is updated frequently, each update adds another module containing the previous version, and the try block keeps growing.

module Original_t = struct
  type t = {a:int}[@@deriving sexp]
end

type t = {a:int; b:float option}[@@deriving sexp]

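(* upgrade an old-version value; the new field defaults to None *)
let t_of_original (x : Original_t.t) : t =
  let a = x.Original_t.a in
  {a; b = None}

(* re-impl t_of_sexp: try the current format first, then fall back to the old one *)
let t_of_sexp s =
  try
    t_of_sexp s
  with Of_sexp_error _ ->
    let x = Original_t.t_of_sexp s in
    t_of_original x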

Have you seen ppx_stable? It is sort of getting at what I think you are going for…conversion between versioned/almost identical types. If you don’t want to use the ppx, maybe you can take inspiration from the approach at least.
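
For anyone who wants the idea without the ppx, here is a rough hand-written sketch of that style of conversion. It is not ppx_stable's generated code or API. The module names, the `stored` type, the `V3` step and its default of `0.0` are all made up for illustration. Each version only knows how to upgrade from its immediate predecessor, so adding a version adds one module and one match case instead of another nested try.

```ocaml
(* Hand-written sketch; ppx_stable itself is not used here. *)
module V1 = struct
  type t = { a : int }
end

module V2 = struct
  type t = { a : int; b : float option }

  (* The added field is supplied as a labeled argument. *)
  let of_v1 ~b (x : V1.t) : t = { a = x.V1.a; b }
end

(* Hypothetical later version in which the type of [b] changes. *)
module V3 = struct
  type t = { a : int; b : float }

  (* A field whose type changed gets a per-field conversion function. *)
  let of_v2 ~modify_b (x : V2.t) : t = { a = x.V2.a; b = modify_b x.V2.b }
end

(* A stored value, tagged with the version it was written with. *)
type stored =
  | Stored_v1 of V1.t
  | Stored_v2 of V2.t
  | Stored_v3 of V3.t

(* Each case performs one upgrade step, then defers to the next version. *)
let rec to_latest : stored -> V3.t = function
  | Stored_v1 x -> to_latest (Stored_v2 (V2.of_v1 ~b:None x))
  | Stored_v2 x ->
    to_latest (Stored_v3 (V3.of_v2 ~modify_b:(Option.value ~default:0.0) x))
  | Stored_v3 x -> x
```

The decoding side would then only need to work out which version a given sexp was written with, for example from an explicit version tag stored next to the data.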


Have you looked at protobufs and Thrift? Both of these formats (probably other formats too, but these are the ones I know well) were explicitly designed to support adding and subtracting fields in controlled ways, while maintaining compatibility of software with newer and older versions of the data. As an added benefit, they store data in language-independent format, so you can read and write it from multiple languages.


A bit more specific to OCaml, as not many languages are supported, but extprot (GitHub: mfp/extprot, "extensible binary protocols for cross-language communication and long-term serialization") does that well.

This is bad practice with GADTs: it is always better to make sure that your type-level tags are distinguishable:

module Labels = struct
  type legacy = private Legacy
  type latest = private Latest
end

Otherwise, outside of the module defining those tags, the typechecker cannot know that they are distinguishable, and thus your use_latest function cannot be written outside of the module without a non-exhaustive pattern warning.
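
To make that concrete, here is a small sketch using those tags. The `versioned` type, its constructors and `upgrade` are hypothetical names chosen for the example, and `use_latest` is only a guess at what such a function might look like. Because `legacy` and `latest` are distinct datatypes rather than abstract types, the single-case matches below are accepted as exhaustive.

```ocaml
module Labels = struct
  type legacy = private Legacy
  type latest = private Latest
end

(* Hypothetical versioned record, indexed by the tags above. *)
type _ versioned =
  | V_legacy : { a : int } -> Labels.legacy versioned
  | V_latest : { a : int; b : float option } -> Labels.latest versioned

(* Only the latest version is accepted, so one case is exhaustive. *)
let use_latest (r : Labels.latest versioned) : float option =
  match r with
  | V_latest { b; _ } -> b

(* Upgrading fills the new field with None. *)
let upgrade (r : Labels.legacy versioned) : Labels.latest versioned =
  match r with
  | V_legacy { a } -> V_latest { a; b = None }
```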

Note that you can also encode the relationship between the versions by using type-level integers:

type z = Z
type !'a s = Succ of 'a
type _ t =
| Version_1:  { x:int } -> z t
| Version_2: {x:int; y:int} -> z s t
| Version_3: {x:int;y:int;z:int } -> z s s t

which makes it possible to write functions that only accept Version_2 and later (that is, index one and up):

let y (type a) (x: a s t) = match x with
| Version_2 {y; _ } -> y
| Version_3 {y; _ } -> y
| _ -> .

My existing data has already been serialized to sexp, whereas this lib seems to be a comprehensive one like protobuf, so I've just starred it to keep an eye on it. Thanks @Khady @Chet_Murthy

This doesn’t help insofar as you’re already tied to sexprs, but ppx_deriving_yojson does this reasonably well for JSON serializations. It’s not as controlled as e.g. thrift and protobufs, but between its @default option (for when one needs to add a slot to a record) and its strict = false option (for when one wants to consume serializations that might contain slots that have been removed from the live model), it covers all of the cases of type evolution I’ve cared to support so far.

(Incompatible changes it doesn’t account for automatically are IMO probably things one should avoid in general anyway.)


I will echo what Chet and Chas have said, with the addition that versioning of structured data is a pretty difficult problem that you almost certainly will be better off avoiding. Instead, the usual way is to make all the fields optional (this is the default in, for example, Protobuf and GraphQL), and for consumers to require the specific fields they need and handle missing fields.
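
A minimal sketch of that pattern in plain OCaml, reusing the field names from the original question. The `wire` type and the `describe` consumer are invented for illustration. The stored record never rejects old data, and each consumer decides which fields it cannot do without.

```ocaml
(* Wire-level record: every field is optional, so old and new data both fit. *)
type wire = { a : int option; b : float option }

(* A hypothetical consumer that requires [a] but tolerates a missing [b]. *)
let describe (w : wire) : (string, string) result =
  match w.a with
  | None -> Error "missing field a"
  | Some a ->
    let b = Option.value w.b ~default:0.0 in
    Ok (Printf.sprintf "a=%d b=%.1f" a b)
```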

To add to Chas’ recommendation, the sexp equivalents seem to be:

ppx_deriving_yojson (PDY): [@default]
ppx_deriving_sexp (PDS): [@default]

PDY: [@@deriving yojson { strict = false }]
PDS: [@@deriving sexp] [@@sexp.allow_extra_fields]
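
Applied to the record from the original question, this could look roughly like the sketch below. It assumes the ppx_sexp_conv preprocessor and the sexplib library are available, and it uses `[@sexp.option]`, the option-specific attribute of ppx_sexp_conv, instead of `[@default]`. With it, a missing `b` is read as None and a present one is written without the extra option wrapping.

```ocaml
(* Assumes ppx_sexp_conv (preprocessor) and sexplib (runtime) are installed. *)
open Sexplib.Std

type t = {
  a : int;
  b : float option [@sexp.option]; (* absent in old data, read as None *)
}
[@@deriving sexp]

(* Old-version data, written before [b] existed. *)
let old_value = t_of_sexp (Sexplib.Sexp.of_string "((a 1))")

(* New-version data. *)
let new_value = t_of_sexp (Sexplib.Sexp.of_string "((a 1) (b 2.5))")
```

With this, the old-version module and the hand-written fallback in t_of_sexp should no longer be needed for the "added an optional field" case.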


Nice, I wasn’t aware of those equivalents (or, I had settled in w/ ppx_deriving_yojson before becoming aware of the sexp practice in OCaml-land, can’t remember at this point).

Hi ryan, I believe this lib converts very neatly between different versions of records. In particular, it supports changing the type of an existing field: the conversion function for that field is passed as a labeled argument.