How possible is a clippy-like linter for OCaml?

Hi !

I’ve been doing a bit of Rust recently, and I have to admit I’ve been quite astonished by Clippy, the main linter for the language.
I think having such a linter for OCaml could be really useful to new users as well as to more experienced OCaml developers. Just as importantly, it would constitute a database of OCaml best practices, which are often quite difficult to assemble.

The existing ppxlib / compiler-libs are already quite good and would make developing such a linter much easier than it would have been a few years ago, and I believe integrating it into ocaml-lsp/dune would be quite easy. The main pain points would be:

  • Defining OCaml best practices (for example, Jane Street has its own style, the stdlib has a different one…)
  • Actually implementing it. That’s one project I would love to take on, but I’m already overwhelmed with work, and there are projects I contributed to in the past and haven’t had time to continue that would probably take priority. I realise this project is a huge task, but I do believe Clippy plays an important part in beginners’ impression of Rust.
  • Reducing the amount of lint noise on legacy code, which is bound not to follow the best practices. But I believe that’s something dune could provide easily.
  • Version management for the lints (some lints would work in newer versions but not in older ones).

When I say linter, I don’t mean “just” style. OCamlformat is very good and already fixes most of the styling issues that one might have, but Clippy also provides guidance that goes beyond formatting.

The number and variety of lints in Clippy is quite impressive: ALL the Clippy Lints
And quite importantly, once again, a lot of lints are semantic and not stylistic, e.g.:
blacklisted-names
almost-swapped

One example that comes to mind is the fact that I keep finding List.concat (List.map f x) in some old parts of the code I’m working on, and that could easily be flagged. Some more tricky hints like “This could probably use Seq instead of List” would also be interesting if at all possible.
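To make the List.concat (List.map f x) example concrete, here is a minimal sketch of the pattern and its replacement. It assumes OCaml 4.10 or later, where List.concat_map is available in the standard library; the function neighbours is made up for illustration.

```ocaml
(* Hypothetical helper, used only to have something to map over. *)
let neighbours x = [ x - 1; x + 1 ]

(* The pattern a linter could flag: it builds an intermediate list of lists. *)
let before xs = List.concat (List.map neighbours xs)

(* The suggested replacement, available since OCaml 4.10. *)
let after xs = List.concat_map neighbours xs

(* Both forms produce the same result. *)
let () = assert (before [ 1; 10 ] = after [ 1; 10 ])
```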

Before writing this post I tried to look for a similar initiative, but haven’t been able to find anything.


I actually thought about creating this kind of linter myself. I think it would be very helpful for beginner students learning OCaml, because they keep writing the same C++-like code and I’m tired of reviewing it.

I’m not too familiar with dune, so I have a few questions/remarks:

  1. Should it be executed during compilation or in another pass, the way inline_tests are implemented?
  2. I think in the near future we will need access to type information, so it should be supported somehow.
  3. In the long run, we will need libraries (those that provide some kind of DSL) to be able to export linter suggestions for their DSL. That would allow us to generate suggestions like concat + map => concat_map without hardcoding them into the linter.
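The idea of rules that live outside the linter can be sketched with a toy model. Everything here is hypothetical: expr is a tiny stand-in for the real OCaml parse tree, and rule, concat_map_rule and suggestions are illustrative names, not an existing API. A real linter would match on the Parsetree (for instance through ppxlib) rather than on this simplified type.

```ocaml
(* Toy expression type, standing in for the real OCaml parse tree. *)
type expr =
  | Var of string
  | App of expr * expr list

(* A rule is a name plus a function proposing a rewrite when it applies.
   A library could export values of this type alongside its own API. *)
type rule = { name : string; rewrite : expr -> expr option }

(* concat + map => concat_map, expressed as data rather than linter code. *)
let concat_map_rule =
  { name = "use-concat-map";
    rewrite =
      (function
        | App (Var "List.concat", [ App (Var "List.map", [ f; xs ]) ]) ->
          Some (App (Var "List.concat_map", [ f; xs ]))
        | _ -> None) }

(* Try every rule on every sub-expression, collecting (rule name, rewrite). *)
let rec suggestions rules e =
  let here =
    List.filter_map
      (fun r -> Option.map (fun e' -> (r.name, e')) (r.rewrite e))
      rules
  in
  let below =
    match e with
    | Var _ -> []
    | App (f, args) -> List.concat_map (suggestions rules) (f :: args)
  in
  here @ below
```

The linter itself then only needs to know how to traverse code and apply a list of rules; which rules exist is decided by whoever supplies that list.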

There have been a few efforts in this direction for OCaml code. For instance:

  • ocamllint, a now-deprecated binary (implemented as a PPX) that applied a fixed set of linting rules. (e.g. the example in the README warns about List.length _ = 0.)

  • bene-gesselint, a proof-of-concept linting PPX (by @NathanReb) that seems to be a spiritual successor to ocamllint. This one supports custom linting rules using Ppxlib’s Ast_pattern, which is a neat trick and (in my opinion) leads to quite clean rule specifications. The repository has an example of adding a constant folding rule for addition.

  • ocp-lint, an attempt at an all-encompassing OCaml linter, with support for linting rules at different levels of analysis (both the parse tree and the typed tree). The support for reasoning about type information unlocks a lot of possibilities (e.g. checking that hashtables have randomisation enabled is expressed here), but obviously adds a lot more complexity to rule specification. Sadly, it looks like this hasn’t seen recent work.

  • ppx_js_style, a PPX that enforces a small number of (relatively minimal) linting rules for Jane Street style.
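For context on the List.length _ = 0 warning mentioned for ocamllint above, here is a small sketch of why such a rule is worth having. The function names are made up for illustration.

```ocaml
(* Flagged form: List.length walks the entire list just to compare with 0. *)
let is_empty_slow l = List.length l = 0

(* Suggested form: pattern matching only inspects the first constructor. *)
let is_empty = function
  | [] -> true
  | _ :: _ -> false

(* Both agree on the result; only the cost differs. *)
let () =
  assert (is_empty_slow [] = is_empty []);
  assert (is_empty_slow [ 1; 2; 3 ] = is_empty [ 1; 2; 3 ])
```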


Dune has a lint pass already IIRC, so that’s already cool :)

That’s also a pain point, it’s true. One should probably query Merlin at this point, because otherwise a lot of the logic would be duplicated.

That’s really interesting ! What comes to mind is eslint (there’s probably something more recent though), which lets you write your own plugins. But it leverages the dynamic nature of JS to do dark magic that cannot (and probably should not) be done in OCaml (or any language for that matter).

Re @CraigFe, I had found 3 of them but did not look much closer because ocamllint and ocp-lint are either deprecated or seem abandoned, and ppx_js_style does not really cover the need. But bene-gesselint does seem quite interesting ! However, it has one downside: it is implemented as a ppx. ppxs interact nicely with dune, and let you auto-correct code perfectly, but one thing they don’t do is “nice error messages”. Basically what I would want from my linter is to generate a list of diagnostics where Diagnostic.t is

  type t =
    { range : Range.t
    ; severity : DiagnosticSeverity.t Json.Nullable_option.t
    ; code : code_pvar Json.Nullable_option.t
    ; codeDescription : CodeDescription.t Json.Nullable_option.t
    ; source : string Json.Nullable_option.t
    ; message : string
    ; tags : DiagnosticTag.t list Json.Nullable_option.t
    ; relatedInformation :
        DiagnosticRelatedInformation.t list Json.Nullable_option.t
    ; data : Json.t option
    }

(which corresponds to the LSP Specification).
A ppx will correct code but will not emit information regarding what it did, or at least not in a nice structured way. Unless one parses the stdout of the ppx as json/whatever and not ocaml ast ? Seems like a bit of a hack. That being said, it would be nice to have a ppx that also does the check (trivial to implement once a linter is done, if there are diagnostics, fail), so it could be put easily in the standard ocaml pipeline.
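As a rough sketch of the "if there are diagnostics, fail" idea, here is a self-contained, simplified stand-in for the record above. The types and names (position, range, diagnostic, to_string, exit_code) are hypothetical and deliberately much smaller than the real Diagnostic.t; they only illustrate how a structured list of diagnostics can drive both human-readable output and a pass/fail result.

```ocaml
(* Hypothetical, simplified stand-ins for the LSP-style types above. *)
type position = { line : int; character : int } (* zero-based *)
type range = { start : position; stop : position }
type severity = Err | Warn | Info | Hint

type diagnostic = {
  range : range;
  severity : severity option;
  code : string option;
  message : string;
}

let severity_to_string = function
  | Some Err -> "error"
  | Some Warn -> "warning"
  | Some Info -> "info"
  | Some Hint -> "hint"
  | None -> "unspecified"

(* Render one diagnostic as file:line:column, using one-based positions. *)
let to_string file (d : diagnostic) =
  Printf.sprintf "%s:%d:%d: %s%s: %s" file
    (d.range.start.line + 1)
    (d.range.start.character + 1)
    (severity_to_string d.severity)
    (match d.code with Some c -> " [" ^ c ^ "]" | None -> "")
    d.message

(* A checking mode only has to fail when any diagnostic was produced. *)
let exit_code = function
  | [] -> 0
  | _ :: _ -> 1

let example =
  { range =
      { start = { line = 21; character = 0 };
        stop = { line = 21; character = 30 } };
    severity = Some Warn;
    code = Some "use-concat-map";
    message = "use List.concat_map" }
```

With this shape, the same list of diagnostics could be printed for a terminal, serialised for an editor, or reduced to an exit status for a build step.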

One thing I may add is: such a linter would require very good documentation (in the style of Clippy or List of available rules - ESLint - Pluggable JavaScript linter), with detailed examples and explanations. But I also realise that the Rust and JS communities are much bigger than OCaml’s, and such efforts are often quite difficult to put together. So one would probably need a simple workflow to go from rule to documentation, one which also enforces good documentation.


Ocp-lint seems to have such detailed docs. E.g. ocp-lint: plugins

True ! It lacks examples, but that’s a bit of a detail and can be fixed.
It also shows that best practices change quite a lot, because one of the lint warnings is for not putting signatures in all-caps.

So there has been some effort there, but it is not actively maintained…
I wonder what the best solution would be: start from scratch, or leverage current implementations ?