[ANN] Orsetto: structured data interchange languages (preview release)


I am pleased to announce that I’ve reached the preview milestone I set for my Orsetto project. As I wrote in the README file about it:

Orsetto is a standalone library comprising a core toolkit…

  • Core functional data structures and processes.
  • Unicode transport, normalization, parsing and formatting.
  • General purpose packet format encoder-decoder processes.

…used to implement streaming parsers and formatters for a useful variety of
structured data interchange languages…

In the preview release, the only major featured languages are JSON and CBOR, but my hope is to expand this list to include a variety of other useful languages. The programming interfaces are sufficiently different from other implementations that I feel Orsetto may be a welcome alternative to have available.

Orsetto is currently available from my personal OPAM repository, which you can use in the conventional way:

opam repository add jhwoodyatt git+https://bitbucket.org/jhw/opam-personal.git

In two weeks, unless discussion here convinces me to delay or defer, I will request to make Orsetto available on the public OPAM repository, along with a commitment to make patch releases as necessary to correct errors.

At this time, I’m inviting the OCaml community to give it a look, post comments and questions about it here, and file issues on the issue tracker if you notice anything wrong. I’m especially interested in knowing about name conflicts that I need to avoid. Once I push to the public OPAM repository, I want to be able to move quickly toward its first stable release.




Some news.

  • I pushed another preview revision to my personal OPAM repository. It just upgrades from Unicode 11 to Unicode 12.
  • I’ve been stalling on releasing 1.0 to the community OPAM repository while waiting to see how much I would need to do to support OCaml 4.08. Sadly, 4.08+beta2 is incompatible with ppx_tools and ppx_migrate_parsetools, on which Orsetto has a dependency. I’m trying to decide whether to drop the dependency or wait a little while longer. The more I look at the PPX world, the less robust it looks, and I’m strongly leaning toward dropping the dependency.


I think if you can afford it, you should do it. ppx_migrate_parsetools has to follow the evolution of the AST, so there will always be a bit of lag on new OCaml releases, and it will often break if you want to test trunk.



It’s not just ppx_migrate_parsetools. There is also the split between ppx_tools and ppx_tools_versioned, which appear to me as competing forks of the same library, both actively under separate maintenance and each used extensively by the community. This is also a problem for the multicore port, which also has new syntax that PPX needs to know about. It seems like everything about PPX smells like “experimental” and it’s weird that so much of the OPAM directory is dependent on these tools that are so tightly coupled to the abstract syntax.



I have released ~preview3, which improves compatibility with OCaml 4.08+beta2, drops the dependency on ppx_deriving, and adds a dependency on stdlib-shims, which I hope will maintain compatibility with the main compiler beta packages more closely than the PPX world seems to be tracking them.



I have now released ~preview4, which resolves Issue #8, “OCaml 4.07: the new Stdlib.Seq.t is functionally equivalent to Cf_seq.t”. For OCaml 4.06, this introduces an external dependency on the seq compatibility package. I’ve also checked that documentary comments are available with odig, so this might be the last preview release before 1.0. (It depends on whether I decide to remove the support for the ppx_let syntax extension.)
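
For readers who have not used the standard sequence type mentioned above, here is a minimal sketch using only the standard library's Seq module (OCaml 4.07 or later, or the seq compatibility package on 4.06). It does not use any Orsetto API; the function names range and sum_even_squares are made up for illustration.

```ocaml
(* A delayed sequence of the integers from [lo] to [hi] inclusive, built
   directly from the Seq.Nil and Seq.Cons node constructors. Nothing is
   computed until a consumer forces the sequence. *)
let rec range lo hi : int Seq.t = fun () ->
  if lo > hi then Seq.Nil else Seq.Cons (lo, range (lo + 1) hi)

(* Keep the even numbers, square them, and add them up. Only the final
   fold actually walks the sequence. *)
let sum_even_squares lo hi =
  range lo hi
  |> Seq.filter (fun n -> n mod 2 = 0)
  |> Seq.map (fun n -> n * n)
  |> Seq.fold_left ( + ) 0

let () = Printf.printf "%d\n" (sum_even_squares 1 10)
```

A library that exposes its streams as Seq.t values can be combined with any other code written against the same standard type, without conversion functions in between.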



It depends on whether I decide to remove the support for the ppx_let syntax extension.

I’ve thought about this, and I will not be removing support for the ppx_let syntax extension. I plan to deprecate it when OCaml 4.08 is released, but it will be retained while I continue supporting OCaml 4.06 and 4.07.
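
As background, and as an assumption about the reasoning rather than something stated in this thread: OCaml 4.08 adds user-defined binding operators such as let* and let+ to the language, which cover much of what ppx_let provides through let%bind and let%map. The sketch below uses the option type with hypothetical helper names (safe_div, compute). It is not Orsetto code and needs OCaml 4.08 or later.

```ocaml
(* Binding operators for the option type: [let*] is monadic bind and
   [let+] is map. Requires OCaml 4.08 or later. *)
let ( let* ) o f = match o with Some x -> f x | None -> None
let ( let+ ) o f = match o with Some x -> Some (f x) | None -> None

(* Hypothetical helper: integer division that fails on a zero divisor. *)
let safe_div a b = if b = 0 then None else Some (a / b)

(* Chain two divisions; any failure short-circuits to None. *)
let compute a b c =
  let* q = safe_div a b in
  let+ r = safe_div q c in
  q + r
```

With ppx_let, the same function would be written with let%bind and let%map inside a module that provides the corresponding bind and map functions, which requires the preprocessor at build time.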



Hi, have you thought about Thrift & protobufs support? I mention b/c … well, as a systems-builder, whenever I reach for a distributed system, I’m also reaching for Thrift, b/c inevitably I need to support some client/server written in C++. [Of course, they’re also designed for performance] Just a thought …



p1. I’ve not given a lot of thought to Thrift because of how its RPC semantics are so tightly coupled with its specification. My attention has mainly been focused on structured data interchange languages.

p2. I’m still thinking out how to deal with structured data modeling languages that are tightly coupled to their corresponding interchange languages, e.g. ASN.1 and BER/DER/xER; Google Protocol Buffers; YANG and NETCONF; CDDL and CBOR; et cetera.

You may want to follow issues #37 and #38, which are about Google Protocol Buffers and generic structured data modeling respectively.