Syntax highlighting on GitHub

For OCaml, all the syntax highlighting on GitHub is done via this project:

The project doesn’t appear to be very active, however, so I wonder whether it would be worthwhile for the OCaml organization to take over and maintain this repo, and eventually point github/linguist to the maintained one.

I filed this on the Community page a couple of months back (probably the wrong place to put it), but it got no traction there. I think forking and updating the grammar is a good idea.

Sidenote: it also looks like it’s possible to run tree-sitter in the browser. One can only hope…

I think they recently switched to the Tree-sitter-based solution:

That appears to be about code navigation rather than syntax highlighting, however.

I have been researching this recently, and yes, the repo you point to is the one GitHub’s Linguist project uses for the OCaml grammar. It is vendored there as a submodule:

linguist/vendor/grammars at master · github/linguist · GitHub

I’ve made a pull request to fix multiline strings, but I’m not holding my breath for it to be merged:

Support for quoted strings a-la {|hello|} by keleshev · Pull Request #17 · textmate/ocaml.tmbundle · GitHub
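For readers unfamiliar with the syntax named in that pull request title, here is a small sketch of OCaml quoted strings next to ordinary string literals. It only illustrates the language feature the highlighter has to recognise, not the grammar change itself, and the variable names are made up for the example.

```ocaml
(* Ordinary string literal: quotes and backslashes must be escaped. *)
let escaped = "a \"quoted\" word and a backslash \\"

(* Quoted string literal: the contents are taken verbatim, with no escapes. *)
let verbatim = {|a "quoted" word and a backslash \|}

(* An optional lowercase identifier between the brace and the bar lets the
   body contain the plain closing delimiter without ending the literal. *)
let with_id = {id|contains |} without ending the literal|id}

(* Quoted strings may span several lines; the newline is part of the value. *)
let multiline = {|first line
second line|}

let () =
  assert (escaped = verbatim);
  assert (with_id = "contains |} without ending the literal");
  assert (multiline = "first line\nsecond line")
```

A grammar that only knows about double-quoted strings will typically mis-colour everything after a `"` that appears inside such a literal, which is presumably what shows up on GitHub today.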

A much better solution would be to make a pull request to Linguist to point to a different up-to-date grammar instead. Fortunately, there is one in the vscode-ocaml-platform project:

vscode-ocaml-platform/syntaxes at master · ocamllabs/vscode-ocaml-platform · GitHub

Their grammar format is slightly different (JSON instead of plist), but according to my research it is supported by Linguist too.

Now it’s a matter of following the steps in the Linguist documentation: point to the new repo, run the checks they provide, fix any issues, and submit a PR to Linguist. I haven’t gotten to it, though. Any volunteers?

Also, the folks @ocamllabs should be consulted, because once this is done, any change in their highlighting grammar will be applied automatically (at regular intervals, a couple of times per year, I reckon) to all source code on GitHub, and that puts a bit of pressure on them.

GitHub uses tree-sitter grammars for some languages. That would be possible for OCaml too, with better results than the TextMate grammars, but they don’t have a public process for adding new tree-sitter grammars.
