[ANN] Cmarkit 0.4.0 - CommonMark parser and renderer for OCaml

Hello,

It’s my pleasure to announce a new release of cmarkit, an ISC-licensed CommonMark parser and renderer for OCaml.

This release provides support for the latest version of the CommonMark specification, updated data for Unicode 17.0.0, a notable semantic change in the task item extension (thanks to @samoht) and a couple of bug fixes and improvements mostly in the CommonMark renderer.

All the details are in the release notes. Thanks to everyone who reported issues.

This release is brought to you by essential funding from the OCaml Software Foundation and my donors.

Homepage: https://erratique.ch/software/cmarkit
Docs: https://erratique.ch/software/cmarkit/doc (or odig doc cmarkit)
Install: opam install cmarkit (opam PR)


P.S. I’m surprised by the number of users (or rather, dissatisfied users :–) of the CommonMark renderer. If you are using it, don’t hesitate to tell me how/why you are using it in this thread, just curious :–)

Thanks for the release!

I’m a (happy so far) user of the CommonMark renderer, but I guess that doesn’t really count as it is for a forked version of cmarkit… I use it to let slipshow users render the slipshow source (a heavily markdown-inspired syntax) into valid markdown, if they need to.

Thanks for the feedback. The reason I’m asking is because I have the feeling that you can often do without.

In this particular example:

Why don’t you parse with locations and extract your user’s own prose from the absolute location offsets found in the AST? Source layout preservation has its limitations, which in turn lead to known discrepancies on rendering. As a user I think I’d rather see my own input so as not to be disoriented.
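
As a rough sketch of that idea, assuming the first and last byte offsets have already been read from a node's location in the AST (taken here to be inclusive, absolute offsets into the source), the user's own text can be sliced out directly. The helper name `source_slice` and the sample offsets are made up for illustration and are not part of cmarkit:

```ocaml
(* Hypothetical helper: return the bytes of [src] between the absolute
   offsets [first] and [last], both inclusive. *)
let source_slice (src : string) ~(first : int) ~(last : int) : string =
  String.sub src first (last - first + 1)

let () =
  let src = "# Title\n\nSome *prose* here.\n" in
  (* Offsets 9 and 26 are assumed to come from the paragraph's location. *)
  print_string (source_slice src ~first:9 ~last:26)
```

Since the slice is taken from the input itself, the user's layout is kept exactly as written.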

Yes, that would probably be a better way of doing it…

In this particular case, due to its low priority, I don’t think I’m going to change it, unless many people complain about the lack of source preservation, which I hope won’t happen: this is meant as a compatibility hatch, and is really just a “side-feature”.

We use it, for example, in builder-web to render README files. We do some light processing to adjust heading levels so they fit in the document they are emitted into.

The fact that the renderer has the ?safe argument allowed us to remove quite some code when migrating from omd.
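
For context, a minimal sketch of that kind of rendering, assuming the cmarkit library is installed. The function name `html_of_readme` is made up here and this is not the builder-web code:

```ocaml
(* Requires the cmarkit library. [html_of_readme] is a hypothetical name. *)
let html_of_readme (src : string) : string =
  (* Parse the CommonMark source into a document. *)
  let doc = Cmarkit.Doc.of_string src in
  (* Render to HTML; with ~safe:true raw HTML from the source is suppressed. *)
  Cmarkit_html.of_doc ~safe:true doc
```

The result is an HTML fragment that can be embedded in a larger page.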

Cool, thanks for the links. But just to be clear, you are not using the rendering to CommonMark, or are you? (But I do see a nice use of the AST Mapper.)

Note, as written in the documentation, I don’t vouch for the safety here :–) There are likely better tools to sanitize the HTML outputs. Maybe I should have called that mostly_safe, this was mostly cargo-culted from cmark --safe.

You’re right! I got confused by the terminology. Indeed, we use the HTML renderer to render the parsed CommonMark documents.

Thanks for the heads up! I opened an issue in our repository for this. I don’t think it’s worse than our handwritten “sanitizer” we used for OMD (we wrote tests for what we sanitize, and Cmarkit passes them). So far the input markdown is all from repositories we control so it’s not been a high priority to be more precise :–)

If you write some tests that it does not pass, I’m happy to update Cmarkit to handle them. Given that safe suppresses raw HTML, I think the only remaining problem is links. Their unsafety is currently asserted with this function.
