Idea: OCaml Symbol Glossary

TLDR: There should be an OCaml glossary of syntax used by advanced features, especially for punctuation-based syntax that is inherently hard to google.

When I’m working through OCaml code someone else has written, whether it’s source code for a library or application or something being discussed in a blog post, there’s a scenario I always dread, and that I find myself in somewhat regularly (although less and less over the years): the code contains some line-noise special characters that are impossible to google, and I can’t figure out what they do.

Off the top of my head, this has happened to me in the past with

  • (type t) in a parameter list → locally abstract type
  • 'a. → explicitly polymorphic type annotation
  • type t. → explicitly polymorphic locally abstract type (!)
  • +'a | -'a → covariant / contravariant type annotation
  • !'a → injectivity type annotation
  • type _ foo → definition of a GADT
  • | (* pattern *) -> . → refutation case in match expression
  • .. | += | `Foo | [> | [< → stuff for extensible variants (.. and +=) and polymorphic variants (`Foo, [> and [<)
  • := → tbh I forget what this one does and it’s hard to google, thus proving my point. I think it does something with removing a type equation in an opened module.
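
For concreteness, here is a minimal sketch of several of these in one file. Every name in it (sort_uniq_with, nested, expr, empty, event and so on) is made up for illustration and is not taken from any library. It assumes a reasonably recent compiler (4.07 or later for the empty variant), and it leaves out !'a and :=.

```ocaml
(* (type a): locally abstract type, needed here to name the element
   type inside the local module. *)
let sort_uniq_with (type a) (cmp : a -> a -> int) (xs : a list) : a list =
  let module S = Set.Make (struct
    type t = a
    let compare = cmp
  end) in
  S.elements (S.of_list xs)

(* 'a. : explicitly polymorphic annotation, required for polymorphic
   recursion (the recursive call is at type 'a list nested). *)
type 'a nested = Flat of 'a | Nest of 'a list nested

let rec depth : 'a. 'a nested -> int = function
  | Flat _ -> 0
  | Nest n -> 1 + depth n

(* "type _ expr" with "Constructor : ... -> ... expr" cases is a GADT;
   "type a." combines the two annotations shown above. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | If (c, t, e) -> if eval c then eval t else eval e

(* "-> ." : refutation case, asserting that no value can match. *)
type empty = |

let absurd : empty -> 'a = function _ -> .

(* +'a / -'a : covariant / contravariant parameters of abstract types. *)
module type Variance = sig
  type +'a producer
  type -'a consumer
end

(* ".." and "+=" : extensible variants. *)
type event = ..
type event += Click of int * int | Key of char

let describe = function
  | Click (x, y) -> Printf.sprintf "click %d,%d" x y
  | Key c -> Printf.sprintf "key %c" c
  | _ -> "unknown"

(* `Tag, [< and [> : polymorphic variants with upper / lower bounds. *)
let to_int : [< `Zero | `One ] -> int = function `Zero -> 0 | `One -> 1
let is_zero : [> `Zero ] -> bool = function `Zero -> true | _ -> false
```

None of the comments above would be discoverable from the symbols alone, which is the whole point of wanting a glossary.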

And I may be forgetting some! In fact I haven’t included any of the weird notation that isn’t valid OCaml code but that often shows up when something doesn’t typecheck and/or in the toplevel, nor have I included any infix operators, both of which cause similar issues.

My proposal is that the official OCaml documentation (either the manual or somewhere on ocaml.org) should prominently feature a glossary page that includes all of these unsearchable symbols, explicitly names them, and links to a page explaining their usage. Technically the OCaml Language section of the OCaml manual provides something like this, but it’s really not written in a way that would be useful for beginners, even if they knew to look there; as it says at the start of the section: “A good working knowledge of OCaml is assumed.” Syntax is written out in BNF-style grammar rather than code examples, the semantics are explained at the bottom in paragraphs of prose rather than in any structured format, and there aren’t really any links to the corresponding parts of the An Introduction to OCaml section, which is much more useful to a beginner.

In my opinion, OCaml’s reliance on special characters for syntax extensions is a much bigger barrier to entry than most experienced users would realize—check any relevant forum and you’ll find lots of posts of the form “what does %syntax do in OCaml?” And yet some of these are still hard to reliably google, especially if you’re a beginner and you don’t know even basic syntax by name yet. Something like this would be really useful for a lot of people, and wouldn’t be that hard to write, especially for an MVP.

As a postscript, I also think it would be very useful if there were an index that matched plain-language descriptions of issues you might encounter in OCaml to advanced features that might solve them; I only learned about the with type t := feature when browsing through the Language Extension section of the manual, but I can think of a couple of times it would have come in handy in the past if I’d only known it existed. But that’s harder to write and so beyond the scope of this post.
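
As a sketch of the kind of problem that feature solves, assuming two hypothetical signatures Showable and Comparable that each declare their own type t: including both in one signature is rejected because t would be defined twice, and destructive substitution removes the duplicate.

```ocaml
(* Hypothetical signatures for illustration; each declares its own [t]. *)
module type Showable = sig
  type t
  val show : t -> string
end

module type Comparable = sig
  type t
  val compare : t -> t -> int
end

(* Plain [include Showable] followed by [include Comparable] would fail
   with a "multiple definition of the type name t" error.
   [with type t := t] deletes [t] from each included signature and
   substitutes the [t] declared here in its place. *)
module type Show_and_compare = sig
  type t
  include Showable with type t := t
  include Comparable with type t := t
end

module Int_sc : Show_and_compare with type t = int = struct
  type t = int
  let show = string_of_int
  let compare = Int.compare
end
```

Note the difference from with type t = t, which keeps the declaration of t in the included signature and so would not avoid the clash.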


This is a nice idea. The ReScript people have done something similar: Syntax Lookup | ReScript Documentation

@chshersh also had an idea for a tool that would take any piece of OCaml code as input and output an explanation of everything inside it.

Side note, on “type _ foo → definition of a GADT”: not necessarily. The underscore can be used anywhere we don’t need to refer to the type variable inside the definition, e.g. a phantom type like type _ foo = int.
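
A minimal sketch of that distinction; the module Id and the values x and y are hypothetical names used only for illustration.

```ocaml
(* An ordinary abbreviation with an unnamed parameter: no GADT involved. *)
type _ foo = int

let x : string foo = 1

(* Accepted, because both [string foo] and [bool foo] expand to [int]. *)
let y : bool foo = x

(* The parameter only starts to matter once the definition is hidden
   behind an abstract type, which is the usual phantom-type pattern. *)
module Id : sig
  type _ t
  val of_int : int -> 'a t
  val to_int : 'a t -> int
end = struct
  type _ t = int
  let of_int n = n
  let to_int n = n
end
```

Outside the module, a string Id.t and a bool Id.t are distinct types even though both are plain integers at runtime.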


I’ve been using the Operator Lookup by @CraigFe. Maybe this could be incorporated in the official OCaml page?

I support the idea! :100:

TIL about the “Operator Lookup” website.

To get some inspiration from the prior art, you can have a look at the Haskell version of this operator cheatsheet:


Note that there is already a glossary of keywords in the manual (OCaml - The OCaml Manual), but it has not been curated for a long time and is missing any non-keyword symbols.

I also agree that it would probably help to have a separate glossary that points towards the “introduction” part of the manual rather than the language description.

Love this idea and this would make a great resource for OCaml.org!

If you can open a PR for this, even if incomplete, that will go a long way, and contributions to this are very welcome!


On such a page, please do not succumb to vanity and use so-called “programming ligatures”.

We can see this on the earlier linked ReScript page. Even on a page whose purpose is to explain the concrete syntax of the language, they could not resist showing fancy arrows etc instead of the actual syntax!


I probably won’t have time to do this until the end of finals season (tbh I wrote this whole post in a big burst of procrastination), but if no one else has done it by then I’ll probably have time over winter break. I might even see if I can write up a “plain-language description of the fight you’re having with the type system → advanced feature that might help” page while I’m at it.
