OCaml YAML library

I know of ocaml-yaml, but it relies on a C FFI binding. I would like to avoid this in order to ensure I can compile to JS and maintain consistent behaviour. (Depending on some JS-native yaml library for JS-targeted builds would work for 99% of cases, but variance in yaml handling between it and ocaml-yaml would inevitably cause all sorts of headaches, e.g. https://arp242.net/yaml-config.html#its-not-portable)

So, is anyone aware of a YAML library similar to ocaml-yaml that doesn’t rely upon any FFI bindings?


In general, implementing full Yaml from scratch is a big endeavour, and not a lot of fun unless you really need it. However, I would be willing to accept a pure OCaml version of the JSON-compatible subset as a PR into ocaml-yaml. This would cover the vast majority of use cases of Yaml, and could be fuzz tested alongside the existing C bindings to ensure that their behaviour is the same.
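To make the fuzz-testing idea concrete, here is a minimal sketch of a differential check between two parsers. The `value` type mirrors the JSON-compatible shape that ocaml-yaml exposes; `parse_fn`, `equal_value`, `agree`, and `disagreements` are hypothetical names introduced only for illustration, and both parsers are passed in as plain functions so the sketch does not depend on any particular library.

```ocaml
(* JSON-compatible value shape (mirrors the shape used by ocaml-yaml). *)
type value =
  [ `Null
  | `Bool of bool
  | `Float of float
  | `String of string
  | `A of value list
  | `O of (string * value) list ]

(* Hypothetical parser signature: input text in, value or error message out. *)
type parse_fn = string -> (value, string) result

(* Structural equality. Float.equal treats nan as equal to itself, so two
   parsers that both produce nan are not reported as disagreeing. *)
let rec equal_value (a : value) (b : value) : bool =
  match a, b with
  | `Null, `Null -> true
  | `Bool x, `Bool y -> x = y
  | `Float x, `Float y -> Float.equal x y
  | `String x, `String y -> String.equal x y
  | `A xs, `A ys ->
    List.length xs = List.length ys && List.for_all2 equal_value xs ys
  | `O xs, `O ys ->
    List.length xs = List.length ys
    && List.for_all2
         (fun (k1, v1) (k2, v2) -> String.equal k1 k2 && equal_value v1 v2)
         xs ys
  | _ -> false

(* Two parsers agree on an input if both reject it, or both accept it and
   produce equal values. Error messages are allowed to differ. *)
let agree (reference : parse_fn) (candidate : parse_fn) (input : string) : bool =
  match reference input, candidate input with
  | Ok a, Ok b -> equal_value a b
  | Error _, Error _ -> true
  | _ -> false

(* Collect the inputs on which the two parsers disagree. *)
let disagreements (reference : parse_fn) (candidate : parse_fn)
    (inputs : string list) : string list =
  List.filter (fun input -> not (agree reference candidate input)) inputs
```

In practice `reference` would presumably wrap the existing C-backed parser and `candidate` the pure OCaml one, with `inputs` supplied by a fuzzer or a test corpus; that wiring is left out here.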

I’m only aware of @kit-ty-kate starting a pure Yaml implementation; no one else has been brave enough to try :wink:


Is there a good YAML test suite out there like the one I found for JSON? The daunting thing about writing a full-featured YAML parser/formatter is the test suite.

YAML does have an official test suite (https://github.com/yaml/yaml-test-suite), but it labors under the same objections that apply to the specification AFAIK, i.e. fundamental security vulnerabilities and JSON incompatibilities that have been discussed in depth elsewhere.

That said, YAML is indeed challenging to implement (even if you aren’t attempting to align with upstream test suites like the one linked above), which is at least part of why so many languages’ primary YAML support libraries (like ocaml-yaml) very reasonably decide to simply use an existing library via whatever FFI mechanisms they have available to them.


I hope this doesn’t hijack this thread, but I wanted to share some experience I believe is related to the C bindings of the YAML library.

Background
It appears to be reasonably easy to trigger failures in the current implementation with larger YAML files. In most cases it simply raises exceptions, but with even larger files I’ve observed segfaults (at least locally).

Issue
I’ve added some additional information to an existing issue, and submitted a PR showing the failure:



CI results (some of the azure-pipeline OS-specific failures seem unrelated, so I’m linking to the ocaml-ci results):
https://ci.ocamllabs.io/github/avsm/ocaml-yaml/commit/2cd258cbc49d945f219779f47aa3896601d8a407

I’m happy to help fix the issues, but I don’t have very deep OCaml or Ctypes knowledge, so I fear my troubleshooting will go in random directions until I improve my own skill level.

Note: while I believe a native OCaml YAML library would be great, I generally agree the effort might not be worth the investment unless there is larger demand in the community.

Possible Solutions
I’ve been trying to find solutions that are easy to maintain and provide the broad capability and stability needed in a parsing library. Sadly, I haven’t found anything that meets these requirements. One thought is to write a C or C++ library that simply converts YAML -> JSON and JSON -> YAML; we could then use the great native JSON libraries to manage all of our interop. This, of course, won’t supply the full feature set, but I suspect it would meet the largest set of needs. If anyone thinks this is a worthwhile feature set, I’m happy to either create a new package or send a PR to the existing ocaml-yaml.


Do you happen to have a link for @kit-ty-kate’s (maybe WIP?) implementation? I’ve not been able to find it. [Edit: never mind! Found it: https://github.com/kit-ty-kate/yummy]

If nothing is readily afoot, then I may well be diving in shortly. It’s hard to get around being able to rely on the same behaviour on the server and in the browser. Entirely aside from that, I think I’d rather like to accept only some subset of YAML (so as to preclude some of the more egregious failure modes of the format), so relying on an external lib would be counterproductive.
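As one illustration of what "accepting only a subset" could look like, here is a minimal sketch of a single, hypothetical restriction: rejecting unquoted scalars such as `no` or `on`, which YAML 1.1 implicit typing resolves to booleans while the YAML 1.2 core schema only recognises `true` and `false`. The names `ambiguous_bool_words` and `check_plain_scalar` are invented for this example, and the rule itself is an assumption about what a strict subset might forbid, not something proposed in this thread.

```ocaml
(* Words that YAML 1.1 resolves to booleans when written unquoted, apart from
   true/false themselves. Matching is done case-insensitively, which is
   slightly stricter than YAML 1.1 (it would also reject e.g. "yEs"). *)
let ambiguous_bool_words = [ "y"; "n"; "yes"; "no"; "on"; "off" ]

(* Hypothetical check applied to each plain (unquoted) scalar before any
   type resolution: either pass the text through or explain the rejection. *)
let check_plain_scalar (s : string) : (string, string) result =
  if List.mem (String.lowercase_ascii s) ambiguous_bool_words then
    Error
      (Printf.sprintf
         "ambiguous plain scalar %S: quote it, or use true/false" s)
  else Ok s
```

A check like this would have to run on the raw scalar text inside the parser, since by the time a value has been resolved to a boolean the original spelling is gone.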