Combining ocamlformat & refmt

Hey everybody,

Since I am currently trying to integrate Reason syntax support into odoc, I had some ideas on how to improve the intersections between both communities.

First of all, with the odoc Reason integration, it will generally be possible to generate documentation from .re / .ml files in any target language (currently selected with an odoc --lang parameter). Sadly, I am not able to preserve the original formatting, so I thought I would ideally use a pretty-printer for each language instead.

Currently we have two pretty-printers: refmt for Reason and ocamlformat for OCaml.

In the current state, refmt is able to print OCaml code as well, but it’s very ugly :frowning:.
So my idea was to incorporate ocamlformat’s pretty-printing logic in the reason-cli tooling.

This would make it much easier to switch from OCaml to Reason and vice-versa within one tool.

So I have some questions:

  • Do you actually like the idea?
  • Reason is using migrate-parsetree data structures… so it should be compatible with ocamlformat, right?
  • ocamlformat depends on Base… I am not entirely sure whether this will cause any compatibility problems on Windows?

Would be happy for any feedback / ideas.

Cheers

Base works fine on Windows.

I like the idea of being able to emit doc pages where the doc is in a reason syntax, but I’m not sure how much I like the idea of depending on refmt (or ocamlformat for that matter).

Here’s why:

  • I’m a bit wary of the idea of introducing new dependencies (there already are many, which we’ll eventually have to remove): it makes it potentially harder to update to newer versions of the compiler. Which also makes it less likely to become the documentation generator used for/on the compiler distribution itself. [1]
  • All the “code” that odoc currently pretty prints is a subset of the signature language. It’s not clear how much one needs ocamlformat/refmt to do a decent job there. [2]
  • All the “code” odoc pretty prints is not just pretty printed, it also has a fair amount of markup. Given that the formatting tools you suggest simply produce text, it seems somewhat painful to have to re-lex, -parse and -interpret them to add the proper markup while preserving the layout emitted by the tool.

All in, I think it’s just way simpler to pretty print the signature items directly from odoc, rather than using another tool to do it.

[1]: Of course it’s always possible to do the OCaml pretty printing directly from odoc, and to have the reason output be both optional and done by refmt.
[2]: It seems like ocsigen is fancier than I would have expected. They also reformat the blocks of code present in docstrings. That’s fancy.

Thanks for your feedback @trefis!

I’m a bit wary of the idea of introducing new dependencies (there already are many, which we’ll eventually have to remove): it makes it potentially harder to update to newer versions of the compiler. Which also makes it less likely to become the documentation generator used for/on the compiler distribution itself. [1]

Yes, I totally agree… I would prefer not to introduce too many dependencies if possible… though a lightweight cross-community lib for formatting in both directions could definitely be a benefit.

All the “code” that odoc currently pretty prints is a subset of the signature language. It’s not clear how much one needs ocamlformat/refmt to do a decent job there.

I was more talking about embedded code examples in docstrings though… we can still print the parsed Compilation_units by hand, but we can’t do that for those Code_blocks, since they are just plain text.
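To illustrate the difference: a code block only reaches us as text, so reformatting it means parsing it first. Here is a minimal sketch for the OCaml side, assuming we link against compiler-libs.common and use the compiler's own Parse and Pprintast modules as a stand-in for a real formatter. This is not how refmt or ocamlformat work internally, and reprint_ocaml is a made-up name.

```ocaml
(* Sketch only: needs the compiler's own library (compiler-libs.common).
   [reprint_ocaml] is a hypothetical helper, not part of odoc. *)

(* Parse a plain-text code block as an OCaml implementation and print it
   back with the compiler's built-in printer.
   Returns [None] when the text does not parse. *)
let reprint_ocaml (source : string) : string option =
  let lexbuf = Lexing.from_string source in
  match Parse.implementation lexbuf with
  | ast -> Some (Format.asprintf "%a" Pprintast.structure ast)
  | exception _ -> None (* lexer or syntax error: leave the block alone *)
```

A caller would fall back to the original text whenever the result is None, so a broken example in a docstring never breaks the doc build.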

All the “code” odoc pretty prints is not just pretty printed, it also has a fair amount of markup. Given that the formatting tools you suggest simply produce text, it seems somewhat painful to have to re-lex, -parse and -interpret them to add the proper markup while preserving the layout emitted by the tool.

Maintaining the markup implementation for both languages will probably be quite painful in the future, and I also think that client-side syntax highlighters in JavaScript do a decent job as well, in case we just render to plain strings (it’s a tradeoff in performance, but at least the codebase would be much, much easier to read).

I will continue my work just with the reason dep for now (to render Code_blocks), and we can decide later with the prototype PR if this is an absolute no-go (if you don’t mind). For the signatures, I will continue building the markup by hand.
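A minimal sketch of what "building the markup by hand" for both syntaxes could look like. All types and names here are hypothetical and much simpler than odoc's real data structures; it only handles a value with a single argument. The point is that the printer emits tokens that still know what they are, so links and highlighting survive, which a formatter returning a plain string cannot give us.

```ocaml
(* Hypothetical types for illustration; odoc's real structures differ. *)
type syntax = OCaml | Reason

type token =
  | Keyword of string
  | Name of string
  | Type_ref of string (* a type name that should become a link *)
  | Text of string

(* A value signature restricted to one argument and one result type. *)
type value_sig = { name : string; arg : string; result : string }

(* Emit the tokens of a signature item in the requested syntax. *)
let render syntax { name; arg; result } =
  match syntax with
  | OCaml ->
    [ Keyword "val"; Text " "; Name name; Text " : ";
      Type_ref arg; Text " -> "; Type_ref result ]
  | Reason ->
    [ Keyword "let"; Text " "; Name name; Text ": ";
      Type_ref arg; Text " => "; Type_ref result; Text ";" ]

(* Plain-text output: all markup information is dropped. *)
let to_text tokens =
  String.concat ""
    (List.map (function Keyword s | Name s | Type_ref s | Text s -> s) tokens)

(* Escape the characters that are special in HTML text. *)
let escape s =
  String.concat ""
    (List.map
       (function
         | '&' -> "&amp;" | '<' -> "&lt;" | '>' -> "&gt;"
         | c -> String.make 1 c)
       (List.of_seq (String.to_seq s)))

(* HTML output: the token kinds decide the markup. *)
let to_html tokens =
  let span cls s =
    Printf.sprintf "<span class=\"%s\">%s</span>" cls (escape s) in
  String.concat ""
    (List.map
       (function
         | Keyword s -> span "keyword" s
         | Name s -> span "name" s
         | Type_ref s ->
           Printf.sprintf "<a href=\"#type-%s\">%s</a>" (escape s) (escape s)
         | Text s -> escape s)
       tokens)
```

For example, to_text (render Reason { name = "string_of_int"; arg = "int"; result = "string" }) gives the Reason form of the signature, while to_html on the same tokens keeps the type names as links.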

In the current state, refmt is able to print OCaml code as well, but
it’s very ugly :frowning:.
So my idea was to incorporate ocamlformat’s pretty-printing logic in
the reason-cli tooling.

This would make it much easier to switch from OCaml to Reason and
vice-versa within one tool.

Duplicating one tool into another is setting things up for difficult
maintenance, I’d avoid it if possible.

Some time ago ocamlformat was used to convert Reason to OCaml code,
which worked by communicating the parsed form of the Reason code to
ocamlformat. This was some time ago and probably needs updating, but
sounds easier to do than any more invasive merging of tools.
Ocamlformat_reason
would be a place to start looking if you’re interested.

But that said, I share @trefis’s hesitation about using it for doc
generation.

Just had some very good input on the Discord chat (@rizo).

Maybe it will be better to just leave this out for now and think about the plugin system first, which is not yet implemented in odoc.

Then we will only have signature translation for now, which should be a good start.

What is this supposed to be?

It’s the support for ocamldoc’s custom tags in odoc. I’m not sure if there are any existing plans/designs for this feature in odoc though (I don’t see any issues on GitHub).

I agree with the concerns of directly adding more dependencies to odoc. Having a custom tag could be a clean way to implement support for translation of examples to Reason. But this will have to wait of course.

Me neither but it seems to me we should rather get rid of that feature. There are nowadays enough other ways to annotate the syntax tree.

What was your precise idea here?

@dbuenzli My idea is to annotate the code blocks with a tag that would trigger an execution of a build-time plugin for translation of OCaml examples into Reason (or the other way around).

val string_of_int : int -> string
(** [string_of_int i] is a string representation of an integer value [i].
    
    @code.ocaml {[
      assert (string_of_int 42 = "42");
      assert (string_of_int 0 = "0");
    ]} *)

The @code.ocaml tag would be picked up by the generator plugin to (optionally) translate the OCaml example into Reason. The main benefit is that the translation logic can live separately from odoc itself.
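To make the plugin idea a bit more concrete, here is a minimal sketch of a registry that maps tag names to translators. Nothing like this exists in odoc today; the types, the register and translate functions, and the stand-in translator are all assumptions for illustration. A real plugin would call out to refmt or a similar tool instead.

```ocaml
(* Hypothetical plugin registry: none of these names exist in odoc. *)

(* A code block found in a docstring, with the tag that marked it. *)
type code_block = { tag : string; body : string }

(* A translator rewrites the body of a block, or declines with [None]. *)
type translator = string -> string option

let registry : (string, translator) Hashtbl.t = Hashtbl.create 8

(* Plugins register themselves under the tag they handle. *)
let register (tag : string) (translator : translator) : unit =
  Hashtbl.replace registry tag translator

(* Apply the translator registered for the block's tag. The block is left
   untouched when no plugin is registered or when the plugin declines. *)
let translate (block : code_block) : code_block =
  match Hashtbl.find_opt registry block.tag with
  | None -> block
  | Some translator ->
    (match translator block.body with
     | Some body -> { block with body }
     | None -> block)

(* Stand-in plugin: only prefixes a marker instead of really translating. *)
let () =
  register "code.ocaml" (fun body -> Some ("/* translated */ " ^ body))
```

With this shape, odoc itself would only need to know about tags and the registry, and blocks with unknown tags pass through unchanged.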

How do you think attributes could be used to implement this? I guess something like the following could work:

val string_of_int : int -> string
(** [string_of_int i] is a string representation of an integer value [i]. *)
[@@code.ocaml (
  assert (string_of_int 42 = "42");
  assert (string_of_int 0 = "0");
)]

Is this what you had in mind?

@dbuenzli Do you think this approach could be used to implement runnable examples as well? One nice benefit of using attributes is that the content has to be valid OCaml (i.e., it’s not just a string), so text editors will apply syntax highlighting and linting to it.
