[ANN] Orsetto: structured data interchange languages (version 1.0)

I am pleased to announce the release of version 1.0 of my Orsetto project. As I wrote in the README file about it:

Orsetto is a standalone library comprising a core toolkit…

  • Core functional data structures and processes.
  • Unicode transport, normalization, parsing and formatting.
  • General purpose packet format encoder-decoder processes.

…used to implement streaming parsers and formatters for a useful variety of
structured data interchange languages…

In this first release 1.0, the major featured languages are only JSON and CBOR, but my hope is to expand this list to include a variety of other useful languages in the 1.x release series. Moreover, its only dependency, apart from those needed for building and testing, is the OCaml distribution itself. The programming interfaces are fairly low-level, and sufficiently different from other implementations that I feel Orsetto may be a welcome alternative to other serialization libraries.

Orsetto is now available at the community OPAM repository, and preview releases of forthcoming versions will continue to be available at my personal repository in Bitbucket, which you can use in the conventional way:

opam repository add jhwoodyatt git+https://bitbucket.org/jhw/opam-personal.git

I have been hacking on various personal projects in OCaml for about seventeen years now, and Orsetto represents the portions of all that I find myself regularly reusing. I’m now promising the OCaml community to be as responsive to issues filed on the Issue tracker as my day job allows, and I welcome contributions and criticisms.

p.s. Now that I’m no longer a Googler, I may now be free to work on some of the features that I deliberately avoided, in compliance with my employment contract while I was employed there.


Is the project distributed under any license?

The BSD 2-clause license is default. If you would like another license, I’m open to negotiation.

Aha, thanks! BSD 2-clause is fine for me, and I am interested in contributing in future when I’m free.

I was asking as I can’t seem to find a license section in the README or a LICENSE file.

EDIT: Are you open to the option of migrating to dune?

p1. Yeah, I’ll add an explicit LICENSE file in the unstable branch. All the source files have BSD 2-clause license headers accompanying the copyright declaration.

p2. When I started this project years and years ago, before even Dune’s predecessor jbuilder was available, it used my personal OMake library, which I also released simultaneously (under a BSD 2-clause license as well) as the Conjury package. It has some features I like that Dune still lacks. Adding an optional Dune build would be something I would consider, but only once I understand how to make it do what I’m able to do with Conjury. In particular, I very much like that Orsetto is one OPAM package that installs a single ocamlfind package named “orsetto”, with each of the various libraries in the framework offered as a subpackage. I don’t currently understand how to do that with Dune.

A few days ago I ran the Orsetto JSON parser through nst@github’s JSON test suite and it revealed a couple of problems. Fixes are already in the branch for the 1.0.1 release I plan to make at the end of the month.

The 1.0.1 release is in the merge queue for the community OPAM repository now.

What sort of features do you need that are missing in Dune? Do feel free to create issues on the Dune GitHub issue tracker with use-cases that would help you.

I said it’s missing features I would like, not features I need.

Dune is billed as an “opinionated build system” that is “designed for Reason and OCaml projects only” (emphasis mine). I don’t feel the need to litigate over what amount to my differences of opinion except to say that a reason I like OMake is that it’s not trying to be a build system for any particular language tool chain. It just wants to be— and I think it succeeds admirably at being— a good replacement for the venerable make(1), and I really like that about it.

That said, I’m not opposed to offering Dune project files for Orsetto, provided I can figure out how to make it a seamless replacement for OMake+Conjury. It might already be capable of doing that. I just don’t know how to use it, and I haven’t tried to learn it because, well, the few things that Dune actually does that OMake+Conjury cannot yet do are not terribly difficult to fix, and I’ll get around to it when the need becomes pressing.

As I mentioned before, one of the things I don’t yet know how to do with Dune is to deliver a single opam package comprising multiple ocamlfind packages, each with varying interdependencies and external dependencies of their own. It appears from my cursory evaluation that this isn’t the usual mode of delivering libraries with Dune.

Instead the pattern I frequently see is that each internal library is delivered as a separate opam package built from the same workspace. That model really sets my teeth on edge, and I’m resisting it. I don’t know whether that’s one of the “opinions” inherent in the Dune build system, or if it really represents a problem that should be filed in the issue tracker. I searched for it in the issue tracker, and I didn’t see it, which leads me to believe it’s the former and not the latter.

As a side note, it’s not clear to me why anybody should care that Orsetto doesn’t have a Dune build system. Isn’t that supposed to be one of the nice features of opam? You don’t have to care what particular build system a package is using— you only care that it installs things into the OPAM switch where it can be referenced by other packages.

I guess you are looking for something like this: https://dune.readthedocs.io/en/latest/installation.html?highlight=public_name#libraries

If you name your libraries orsetto.foo they will be installed as an ocamlfind sub-library for orsetto. And you can use either local or external libraries in the libraries stanza so it’s easy to have complex inter-dependencies.

You can also check https://github.com/mjambon/dune-starter for a relatively complex template for dune projects.
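As a rough illustration of the public_name approach described above, here is a minimal sketch of two dune files in one workspace. The directory names, the library names orsetto_foo and orsetto_bar, and the dependency on unix are hypothetical and are not taken from Orsetto's actual layout. The sketch also assumes a single orsetto.opam file at the workspace root, so that both libraries are installed by the one opam package.

```dune
; foo/dune (hypothetical directory and library names)
; Installed as the ocamlfind subpackage "orsetto.foo".
(library
 (name orsetto_foo)
 (public_name orsetto.foo))

; bar/dune (hypothetical)
; Installed as "orsetto.bar"; depends on the sibling library above
; and on one library from outside the workspace.
(library
 (name orsetto_bar)
 (public_name orsetto.bar)
 (libraries orsetto.foo unix))
```

With something like this in place, a client would refer to the libraries as orsetto.foo and orsetto.bar through ocamlfind, while opam sees only the single orsetto package.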

Ah, yes— so that’s how Dune does something like what I want. Except I’m also not an admirer of the -no-alias-deps compiler flag, and I suspect that’s in the category of “difference of opinion” not a lack of a feature. Although, issue #1819 is relevant here.

So, it looks like all the features I would need are available in Dune, but the question remains: why would anybody care which build system I’m using? It would be a fair bit of work to develop a parallel build system for Orsetto that uses Dune (mainly because of the code generator in the Unicode library), and I’m not understanding what problem it would solve.

If you don’t like -no-alias-deps, you may use the (wrapped false) mode to disable it. I’m not in any way trying to convince you to use dune, just making a remark :)
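For reference, a minimal sketch of what that could look like in a library stanza. The library name is hypothetical, and whether this fully avoids -no-alias-deps may depend on the Dune version in use.

```dune
; Hypothetical library stanza with wrapping turned off.
; Modules are then exposed under their own names rather than
; through a generated wrapper module.
(library
 (name orsetto_foo)
 (public_name orsetto.foo)
 (wrapped false))
```

One trade-off to keep in mind: without wrapping, module names are not namespaced by the library, so they would need distinctive names to avoid collisions with other libraries.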

Oh! Okay, that’s cool.

So, I guess the TL;DR here is that as soon as I understand the nature of the problem that would be solved by adding support for the Dune build system to Orsetto, I can file an issue to cover it and give it all due consideration.

This is a feature that comes with disadvantages. Some people are working on a dune-only package manager and it seems to have some compelling advantages:

  • The ability to edit dependencies without pins
  • Much faster build times with composed projects
  • Free cross compilation

Having dune files in your projects makes them usable in such a setup.

All of those are reasons to add Bazel as a supported build system. And Bazel has the advantage of being a mature system, with support for a wide variety of programming languages, not just OCaml, and which I use in my day job already. Not sure how this forthcoming Dune system will be better than Bazel. Any pointers would be nice.

Indeed that’s a good point. Bazel/Buck alone does not provide a full story because they’re too heavyweight for the average open source project. However, I realize that for companies with existing polyglot monorepos, they’re the only feasible option.

I think there’s one crucial way in which dune is going to stay much better than Bazel/Buck and other OCaml build systems: it has the best rules for building OCaml code by far. As far as I know, it’s the only system that has all of:

  • Correctly uses ppx’s by building drivers.
  • Allows transparent definition of ppx rewriters with runtime dependencies
  • Understands -opaque
  • Protects the user from toplevel name collisions for wrapped libraries and executables
  • Supports virtual libraries (the linking hack generalized)
  • Transparent support for inline tests
  • Supports many external tools like jsoo, merlin, odoc.

I would like to share these benefits with users of Bazel/Buck, but how best to accomplish that is still an open question.