[ANN] data-encoding.0.5 release

On behalf of Nomadic Labs, I’m happy to announce the release of data-encoding version 0.5.

This new version brings several bug fixes, some increased test coverage, minor improvements in the API, and a major new feature:

Compact encodings: sub-byte tag sizes

This new version provides a new set of combinators for compact encodings. These compact encodings handle the verbose and error-prone bit-twiddling needed to combine multiple sub-byte discriminators into a single byte-sized one.

E.g.,

  • the encoding let e1 = either (either bool unit) (option bool) uses three bits in the shared tag and zero bytes after that;

  • the encoding let e2 = either int32 int64 uses one bit in the shared tag and either 4 or 8 bytes to represent the integer;

  • the product encoding let ee = tup2 e1 e2 uses four (3 + 1) bits in the shared tag and either 4 or 8 bytes to represent the integer of e2.
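To make the bit counting concrete, here is a hand-written sketch of the kind of bit-twiddling the compact combinators are meant to automate. It uses only the OCaml standard library and does not call data-encoding. The either type, the function names, and the exact bit layout (the three bits of e1 above the single bit of e2) are assumptions made for illustration; the layout the library actually produces may differ.

```ocaml
(* Illustration only: a manual version of the shared-tag packing.
   The bit layout chosen here is hypothetical, not data-encoding's. *)
type ('a, 'b) either = Left of 'a | Right of 'b

type e1 = ((bool, unit) either, bool option) either
type e2 = (int32, int64) either

(* e1 has six possible values, so three bits are enough and no payload
   bytes are needed. *)
let tag_of_e1 : e1 -> int = function
  | Left (Left false) -> 0b000
  | Left (Left true) -> 0b001
  | Left (Right ()) -> 0b010
  | Right None -> 0b100
  | Right (Some false) -> 0b101
  | Right (Some true) -> 0b110

let e1_of_tag : int -> e1 = function
  | 0b000 -> Left (Left false)
  | 0b001 -> Left (Left true)
  | 0b010 -> Left (Right ())
  | 0b100 -> Right None
  | 0b101 -> Right (Some false)
  | 0b110 -> Right (Some true)
  | _ -> invalid_arg "e1_of_tag: unknown tag"

(* e2 needs one bit to say which integer width follows. *)
let tag_of_e2 : e2 -> int = function Left _ -> 0 | Right _ -> 1

(* Shared tag byte: bits 3..1 hold the e1 tag, bit 0 holds the e2 tag.
   The integer payload of e2 follows the tag byte. *)
let encode ((v1, v2) : e1 * e2) : Bytes.t =
  let tag = (tag_of_e1 v1 lsl 1) lor tag_of_e2 v2 in
  match v2 with
  | Left i ->
      let b = Bytes.create 5 in
      Bytes.set_uint8 b 0 tag;
      Bytes.set_int32_be b 1 i;
      b
  | Right i ->
      let b = Bytes.create 9 in
      Bytes.set_uint8 b 0 tag;
      Bytes.set_int64_be b 1 i;
      b

let decode (b : Bytes.t) : e1 * e2 =
  let tag = Bytes.get_uint8 b 0 in
  let v1 = e1_of_tag (tag lsr 1) in
  let v2 : e2 =
    if tag land 1 = 0 then Left (Bytes.get_int32_be b 1)
    else Right (Bytes.get_int64_be b 1)
  in
  (v1, v2)
```

With this layout, a pair takes one tag byte plus 4 or 8 bytes for the integer, so 5 or 9 bytes in total. Writing and maintaining this by hand for every type is exactly the error-prone part.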

How to get

The code is available under MIT license on Nomadic Labs / data-encoding · GitLab.

It can be installed via opam.


Hi @raphael-proust! I have a question regarding the connection between data-encoding and json-data-encoding, also developed at Nomadic Labs. The latter seems tied to JSON, whereas the former is more flexible, supporting also binary encodings. However, since data-encoding also supports JSON, doesn’t it subsume json-data-encoding completely?

Hello @dario ,

The data-encoding library uses json-data-encoding for its JSON backend. It delegates the conversion of OCaml values to and from JSON to the primitives provided in the interface of json-data-encoding.

In a way, yes, as an end-user you don’t need to use json-data-encoding directly because you can use the Json module of data-encoding instead. There are three possible reasons why you might add json-data-encoding as a (non-transitive) dependency to your project and use it directly in your code:

  • You want to keep the dependency set and the number of abstraction layers as small as possible. E.g., in order to reduce binary size.

  • You want some static guarantees that some encodings are only ever used for JSON. E.g., in your logging system.

  • You need to define a JSON encoding which is rejected by data-encoding on grounds that it is invalid in binary. Note that:

    • This is very specific to some combinators, but basically some combinators will reject their inputs (raise Invalid_argument) because using the serialiser would lead to undecodable data. Most typically, this happens if you try to concatenate two fields of unknown length. Decoding the result becomes a guessing game as to where one field stops and where the next begins. In JSON, these could easily be represented as an array, which includes all the delimiters you need to decode it.
    • There are other workarounds (e.g., prefixing the fields with a length field), but going for the JSON encoding directly is a valid approach if you only need JSON.
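To illustrate the "guessing game" and the length-prefix workaround, here is a minimal sketch using only the OCaml standard library. It does not use data-encoding, and the function names and the 4-byte big-endian length prefix are choices made for this example, not the library's format.

```ocaml
(* Illustration only: why concatenating two variable-length fields is
   undecodable, and how a length prefix fixes it. *)

(* Naive binary form: the two fields are written back to back. *)
let encode_naive ((a, b) : string * string) : string = a ^ b

(* Workaround: write the length of the first field before it. The second
   field simply takes whatever bytes remain. *)
let encode_prefixed ((a, b) : string * string) : string =
  let buf = Buffer.create (4 + String.length a + String.length b) in
  Buffer.add_int32_be buf (Int32.of_int (String.length a));
  Buffer.add_string buf a;
  Buffer.add_string buf b;
  Buffer.contents buf

let decode_prefixed (s : string) : string * string =
  let len = Int32.to_int (Bytes.get_int32_be (Bytes.of_string s) 0) in
  let a = String.sub s 4 len in
  let b = String.sub s (4 + len) (String.length s - 4 - len) in
  (a, b)
```

With encode_naive, the pairs ("ab", "c") and ("a", "bc") both serialise to "abc", so no decoder can tell them apart. With the prefix, the boundary is recorded explicitly and the pair can be recovered. A JSON array such as ["ab", "c"] carries the same boundary information through its own delimiters.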

Version 0.5.1 of data-encoding has just been released.

This is a bugfix release making one of the library’s internal checks more permissive. Without this fix (i.e., using version 0.5), some valid encodings are rejected (raising Invalid_argument) by the library.

You can update via opam: opam install data-encoding.0.5.1