Dear camlidae,
I’m happy to announce the release of ocaml-protoc 3.0 (alongside the multiple runtime libraries: pbrt, pbrt_yojson, and the new pbrt_services). This is a major breaking release. I’m sorry for that (I do, however, believe it necessary) and recommend every user of ocaml-protoc add an upper bound < 3.0 to their current project and migrate when they have time.
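In an opam file, such a bound would look roughly like this (adjust to whatever constraints you already have):

```
depends: [
  "ocaml-protoc" {< "3.0"}
]
```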
First, a summary. ocaml-protoc is a self-contained compiler that turns protobuf IDL files (.proto files) into OCaml types, pretty-printers, and (de)serialization functions. The runtime library pbrt (“protobuf runtime”) contains support code for printers and binary (de)serialization; pbrt_yojson contains support code for JSON (de)serialization by way of yojson.
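To give a feel for the output, here is an illustrative sketch only — names and signatures are simplified and are not the exact generated API:

```ocaml
(* For a hypothetical .proto message

     message person { string name = 1; int32 age = 2; }

   ocaml-protoc emits roughly this kind of code: *)
type person = {
  name : string;
  age : int32;
}

(* a pretty-printer for the type... *)
let pp_person fmt (p : person) =
  Format.fprintf fmt "{ name = %S; age = %ld }" p.name p.age

(* ...plus binary (de)serializers whose support code lives in pbrt,
   shaped like
     encode_pb_person : person -> Pbrt.Encoder.t -> unit
     decode_pb_person : Pbrt.Decoder.t -> person *)
```

The exact names in the generated .mli differ; this only conveys the shape: plain records, a printer, and a pair of (de)serializers per message.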
So what changed with ocaml-protoc 3.0? Many things.
For a start, from one .proto file we now generate a single pair of .ml and .mli files instead of several pairs. This reduces the boilerplate in build systems and simplifies user code overall (one module per .proto file). A large internal refactor of ocaml-protoc was done prior to the integration of… services.
The major new feature of ocaml-protoc 3.0 is support for service declarations. These are essentially a way to describe RPC endpoints, next to the types used to interact with them (example; full generated code). This is typically what is used in systems such as gRPC. ocaml-protoc now generates server and client stubs for each endpoint that pack together the type definitions and the relevant (de)serializers; that code doesn’t presume anything about a concrete RPC system. I have in the works a simple Twirp OCaml library that relies on this generated code to provide services over HTTP 1.1; it is also possible to write RPC systems over ZMQ, websockets, etc. without changes to the generated code [1].
Another big-ish change is what the generated code looks like, at least when it comes to binary (de)serialization. ocaml-protoc 3.0 comes with significant speedups for encoding (up to twice the throughput, and an order-of-magnitude reduction in allocations in some cases [2]) and some less impressive speedups for decoding. This is a combination of multiple changes:
- use of a few C stubs to accelerate varint decoding/encoding;
- encoding is done back-to-front, which allows the encoder to use a single slice internally [3]. This is what required changes in the generated code in the first place;
- encoding code now requires far fewer closures (arguments are passed explicitly instead), which reduces allocations to almost nothing.
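A toy illustration of the back-to-front idea (names are made up, not pbrt’s API): the encoder fills a single buffer from the end toward the start, so a sub-message can be written first and its length varint prepended afterwards, with no guessing about how many bytes to reserve.

```ocaml
(* A buffer filled from the end toward the start. *)
type encoder = {
  buf : Bytes.t;
  mutable pos : int; (* next free byte, moving toward index 0 *)
}

let create n = { buf = Bytes.create n; pos = n }

let write_byte e b =
  e.pos <- e.pos - 1;
  Bytes.set e.buf e.pos (Char.chr b)

(* LEB128 varint. Because we write backwards, the byte that must end
   up last in memory is written first. *)
let write_varint e n =
  let rec groups n = (* 7-bit groups, low to high *)
    if n < 0x80 then [ n ] else (n land 0x7f) :: groups (n lsr 7)
  in
  let gs = groups n in
  let last = List.length gs - 1 in
  let bytes = List.mapi (fun i g -> if i = last then g else g lor 0x80) gs in
  List.iter (write_byte e) (List.rev bytes)

let to_string e = Bytes.sub_string e.buf e.pos (Bytes.length e.buf - e.pos)
```

For example, encoding 300 yields the two bytes 0xAC 0x02; and since a sub-message’s bytes are already in the buffer by the time its length prefix is written, one slice suffices.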
I haven’t recently benchmarked against other protobuf implementations in OCaml, but I’m reasonably confident that this is now the fastest one by a healthy margin.
There are also other improvements and bugfixes. I want to thank in particular @Konstantin_Olkhovski for some of these contributions and for very helpful discussions, and also @VPhantom for more discussions on the topic of performance.
The changelog contains many more details.
[2] if the encoder is reused, there are almost no minor allocations, and no major allocations, to encode an existing value into the encoder’s buffer. ↩︎
[3] because sub-messages use varints for their sizes, encoding front-to-back cannot be done efficiently in a single buffer: it’s not clear how many bytes to reserve in front of a sub-message. With back-to-front that’s not an issue. ↩︎