Deserialize a numeric JSON field into a variant using ppx_deriving_yojson

Problem

module Num = struct
  type t =
    | NInt of int
    | NFlt of float
  [@@deriving show, yojson]
 (* snip *)

My incoming JSON payload looks like { "number": 1 } or { "number": 1.1 }, wherein I expect int and float JSON values. I need to get these basic JSON values into my Num type.

ppx_deriving_yojson lets you specify custom serializers and deserializers. I’ve tried a variety of fns and syntaxes, and though a bunch of things compile, they just don’t work. The docs are a bit light on the matter, so I’ve been going through the ppx tests: ppx_deriving_yojson/test_ppx_yojson.ml at e030f13a3450e9cf7d2c43fa04e709ef608486cd · ocaml-ppx/ppx_deriving_yojson · GitHub. The tests are also a bit light on custom [de]serialization.

Discussion

  • How can I deserialize JSON int & float values gracefully to Num.t?
  • How can I serialize Num.t back to untagged JSON, much like the inputs came in (vs tagged JSON output)?

I considered sharing my failed attempts, but at this point I think it would distract more than help :). Thanks!


The payload here is different enough from the OCaml data type that I think you’re justified to write your own codec. E.g.,

module Num = struct
  type t = NInt of int | NFlt of float [@@deriving show]

  let to_yojson = function
    | NInt i -> `Int i
    | NFlt f -> `Float f

  let of_yojson = function
    | `Int i -> Ok (NInt i)
    | `Float f -> Ok (if Float.is_integer f then NInt (int_of_float f) else NFlt f)
    | json -> Error (Yojson.Safe.to_string json)
end

Now you can type your payload as:

type payload = { number : Num.t } [@@deriving show, yojson]
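
To see how the pieces fit without installing anything, here is a minimal sketch of roughly what the derived payload codec ends up doing. It is an approximation, not the ppx's actual output: JSON values are written as bare polymorphic variants that reuse the Yojson.Safe.t constructor names, the error strings are made up, and payload_to_yojson / payload_of_yojson are hand-written stand-ins for the generated functions.

```ocaml
(* Dependency-free sketch: JSON values are bare polymorphic variants with
   the same constructor names as Yojson.Safe.t (`Int, `Float, `Assoc). *)
module Num = struct
  type t = NInt of int | NFlt of float

  (* Untagged output: a Num.t becomes a plain JSON number. *)
  let to_yojson = function
    | NInt i -> `Int i
    | NFlt f -> `Float f

  (* Same logic as the codec above; the error message is hypothetical. *)
  let of_yojson = function
    | `Int i -> Ok (NInt i)
    | `Float f ->
      Ok (if Float.is_integer f then NInt (int_of_float f) else NFlt f)
    | _ -> Error "Num.t: expected a JSON number"
end

type payload = { number : Num.t }

(* Hand-written stand-ins for what [@@deriving yojson] would generate.
   The point is only that they call Num.to_yojson / Num.of_yojson by name. *)
let payload_to_yojson p = `Assoc [ ("number", Num.to_yojson p.number) ]

let payload_of_yojson = function
  | `Assoc fields -> (
      match List.assoc_opt "number" fields with
      | Some j -> Result.map (fun number -> { number }) (Num.of_yojson j)
      | None -> Error "payload.number: missing field")
  | _ -> Error "payload: expected a JSON object"
```

With the real libraries you would parse with Yojson.Safe.from_string and pass the result to the derived payload_of_yojson instead.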

Amazing. I didn’t realize i could just … implement those fns! Thanks @yawaramin!


I was flummoxed in using ppx’s non-trivially for a while as well, until I had the same revelation: ppx-emitted code just refers to definitions whose names follow implicit conventions, resolved in the lexical scope where the ppx is used.

Where this is very handy is that you can trivially customize behaviour in different contexts. Say you have a Foo that is used widely in your program, and you use yojson converters for basic IPC:

module Foo = struct
  type t = ...... [@@deriving yojson]
end

Fine, very convenient. But when you want to present Foos to e.g. users of an external API, the naive implementation won’t be so great:

module API = struct
  (** The type of response provided for [Foo] GET requests  *)
  type foo_get =
    { id : Uuidm.t
    ; name : string
    ; foo : Foo.t
    }
  [@@deriving yojson]
end

With this, API consumers will be exposed to yojson’s rendering of whatever Foo.t is, which is surely not something they’ll be expecting, and which will probably leak all kinds of internal changes over time, thus churning that external API’s compatibility. And, we don’t want to change Foo.to_yojson, as that will impact our IPC usage.

The solution is to just make an API-local Foo, and implement whatever encoding of it makes sense in that context:

module API = struct
  module Foo = struct
    include Foo
    let to_yojson foo = ....
  end

  (** The type of response provided for [Foo] GET requests  *)
  type foo_get =
    { id : Uuidm.t
    ; name : string
    ; foo : Foo.t
    }
  [@@deriving yojson]
end

Now my API-local Foo module with its own purpose-built to_yojson is what is lexically in scope for the code in the derived foo_get_to_yojson function, and my internal IPC usage of the default Foo.to_yojson is undisturbed.
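
The scoping effect can be reproduced without the ppx at all. The sketch below is only an illustration: the fields of Foo.t are invented, JSON values are bare polymorphic variants standing in for Yojson.Safe.t, and foo_get_to_yojson is written by hand where the deriver would normally generate it.

```ocaml
(* Hypothetical Foo with invented fields; JSON values are bare polymorphic
   variants using Yojson.Safe.t constructor names. *)
module Foo = struct
  type t = { internal_id : int; label : string }

  (* Stand-in for the derived, IPC-oriented encoding. *)
  let to_yojson foo =
    `Assoc [ ("internal_id", `Int foo.internal_id); ("label", `String foo.label) ]
end

module API = struct
  module Foo = struct
    include Foo

    (* Shadows the included to_yojson, but only inside API. *)
    let to_yojson foo = `String foo.label
  end

  type foo_get = { name : string; foo : Foo.t }

  (* Hand-written stand-in for the derived function. It just says
     Foo.to_yojson, and that name resolves to API.Foo.to_yojson here. *)
  let foo_get_to_yojson r =
    `Assoc [ ("name", `String r.name); ("foo", Foo.to_yojson r.foo) ]
end

let foo = { Foo.internal_id = 7; label = "widget" }

(* Internal encoding, untouched by the API-local module. *)
let ipc_json = Foo.to_yojson foo

(* External encoding, picked up through lexical scope. *)
let api_json = API.foo_get_to_yojson { API.name = "example"; foo }
```

Because API.Foo includes the original module, API.Foo.t and Foo.t are the same type, so values move freely between the two contexts and only the encoder differs.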


(BTW, ppx_deriving_yojson does support using an annotation for customizing the serializer used for just one field, e.g.

module API = struct
  let api_foo_to_yojson foo = .......

  (** The type of response provided for [Foo] GET requests  *)
  type foo_get =
    { id : Uuidm.t
    ; name : string
    ; foo : Foo.t [@to_yojson api_foo_to_yojson]
    }
  [@@deriving yojson]
end

This is great if you really only need said customization in a single location, but quickly becomes tedious otherwise, so I find myself defining local copies of modules much more frequently.)


These are good tips. As a third option, you can avoid [@to_yojson] by giving the field a type alias with its own converters:

  type api_foo = Foo.t
  let api_foo_to_yojson foo = .......
  let api_foo_of_yojson = Foo.of_yojson

  type foo_get =
    { id : Uuidm.t
    ; name : string
    ; foo : api_foo
    }
  [@@deriving yojson]

Hey all, maybe you’ll kindly spare me a bit more advice here. It’s disjoint from the original problem statement, but I realized I need int64, not int, in my type. The ppx has a bunch of custom processing for int64, as there is no `Int64 poly variant in Yojson.Basic.t. Perhaps I need to emulate what the PPX is doing to cram my int64 into something that Yojson.Basic.t accepts, as a cheeky little fake out. However, it’s not clear to me what that Exp.variant expression is up to, so I’m not sure how to emulate it.

Strictly speaking, the JSON spec does not fix a numeric precision, but many parsers (and JavaScript itself) treat every number as an IEEE 754 double, which corresponds to the OCaml float type. So you can read the data as float and try to convert it to int64. However, since int64 has a narrower range than float, you have to guard against overflow at both ends.

let max_int64_flt = Int64.(to_float max_int)
let min_int64_flt = Int64.(to_float min_int)

let of_yojson = function
  | `Int i ->
    Ok (NInt (Int64.of_int i))
  | `Float f ->
    Ok begin
      (* Int64.to_float max_int rounds up to 2^63, which is already out of range, hence >= *)
      if f < min_int64_flt || f >= max_int64_flt || not (Float.is_integer f) then
        NFlt f
      else
        NInt (Int64.of_float f)
    end
  | json ->
    Error (Yojson.Safe.to_string json)

Encoding as JSON should be simpler of course because Int64.to_float is guaranteed to not underflow/overflow.
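
One caveat on both directions: Int64.to_float never overflows, but it does round. The sketch below (the names are only illustrative) checks two consequences. Int64.max_int converts to exactly 2^63, so an upper-bound test against that float has to exclude the bound itself, and any int64 whose magnitude exceeds 2^53 may not survive a trip through float.

```ocaml
(* Int64.max_int is 2^63 - 1, which a double cannot represent, so the
   conversion rounds up to 2^63. *)
let max_int64_flt = Int64.to_float Int64.max_int
let rounds_up_to_2_pow_63 = (max_int64_flt = 9223372036854775808.)

(* Int64.min_int is -2^63, which is exactly representable. *)
let min_int64_flt = Int64.to_float Int64.min_int
let min_is_exact = (min_int64_flt = -9223372036854775808.)

(* 2^53 + 1 is the smallest positive integer a double cannot hold. *)
let big = 9007199254740993L
let survives_round_trip =
  Int64.equal (Int64.of_float (Int64.to_float big)) big
```

Here rounds_up_to_2_pow_63 and min_is_exact are true, while survives_round_trip is false. If exact large integers matter, an encoding that keeps the digits as text is safer than going through float.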

ppx_deriving_yojson is intended to be used with Yojson.Safe.t (so that’s the parser you’re supposed to run). Is there a reason you’re using Yojson.Basic.t?

If you’re stuck with Yojson.Basic.t, you can’t encode numbers that do not fit into an int. If you want this behavior but still have an int64 in the record (probably a bad idea), you can use something like:

type small_int64 = int64
let small_int64_of_yojson j = Result.map Int64.of_int ([%of_yojson: int] j)
let small_int64_to_yojson n = [%to_yojson: int] (Int64.to_int n)

… and use small_int64 in your type. ([%of_yojson: t] expands to the derived function for type t).
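
If you can switch to Yojson.Safe.t, a hand-written codec can keep the full int64 range, because Yojson.Safe.t carries integers that do not fit in an int as `Intlit of string. The following is a sketch under assumptions: the JSON values are bare polymorphic variants mirroring those constructors so that the snippet runs without the library, the error messages are invented, and whether the parser hands you `Int or `Intlit for a given literal depends on its size.

```ocaml
(* Sketch only: bare polymorphic variants mirror the Yojson.Safe.t
   constructors `Int, `Intlit and `Float; no library is required. *)
module Num = struct
  type t = NInt of int64 | NFlt of float

  (* Always emit integers as a literal string of digits, so no precision
     is lost for values beyond the native int or double range. *)
  let to_yojson = function
    | NInt i -> `Intlit (Int64.to_string i)
    | NFlt f -> `Float f

  let of_yojson = function
    | `Int i -> Ok (NInt (Int64.of_int i))
    | `Intlit s -> (
        (* of_string_opt returns None for digits outside the int64 range. *)
        match Int64.of_string_opt s with
        | Some i -> Ok (NInt i)
        | None -> Error "Num.t: integer literal out of int64 range")
    | `Float f -> Ok (NFlt f)
    | _ -> Error "Num.t: expected a JSON number"
end
```

Unlike the float-based decoder above, this one leaves `Float values as NFlt even when they happen to be whole numbers; add that conversion back if your producers write integers as 1.0.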