Refactoring tools for OCaml? Type-based refactoring?

aryx · July 27, 2021, 7:56am

Hi,

What are the current tools to help refactor OCaml code?

I have a very heavy refactoring where I previously defined an AST like
type expr =
| Int of int
| String of string
| Plus of expr * expr

and I need to add some extra information at each node, like
type expr = { e: expr_kind; id: int }
and expr_kind =
| Int of int
| String of string
| Plus of expr * expr

That means old code like
match e with
| Int i -> …
must become
match e.e with
| Int i -> …

and each time I return an expression, I must now encapsulate it with a constructor function.
For example
match stuff with
| Whatever -> Int 2

must now become
match stuff with
| Whatever -> mke (Int 2)

where let mke ekind = { e = ekind; id = gensym() }
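For reference, the target shape described above can be put together as one compilable sketch. The `counter`-based `gensym` and the `eval_int` function are assumptions added purely for illustration; the post uses `gensym` without defining it and does not show any traversal.

```ocaml
(* New representation: every node carries an id next to its kind. *)
type expr = { e : expr_kind; id : int }
and expr_kind =
  | Int of int
  | String of string
  | Plus of expr * expr

(* Hypothetical fresh-id generator; not defined in the original post. *)
let counter = ref 0
let gensym () = incr counter; !counter

(* Smart constructor from the post: wraps a kind into a full node. *)
let mke ekind = { e = ekind; id = gensym () }

(* Illustrative traversal (not from the post): matches now go through [x.e]. *)
let rec eval_int (x : expr) : int option =
  match x.e with
  | Int i -> Some i
  | String _ -> None
  | Plus (a, b) ->
    (match eval_int a, eval_int b with
     | Some m, Some n -> Some (m + n)
     | _ -> None)

(* Every place that used to build a bare constructor now calls [mke]. *)
let two_plus_three = mke (Plus (mke (Int 2), mke (Int 3)))
```

The two mechanical changes are visible here: `match x.e with` instead of `match x with`, and `mke (...)` around each constructed node.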

There are lots of files involved, so I’d rather use a tool that can help automate some of it.

aryx · July 27, 2021, 7:57am

Right now the only solution I see is to abuse the LSP server to identify all
the places programmatically and combine that with some emacs kungfu (macros)
to help automate most of it (and then rely on the typechecker to manually tune things).

talex5 · July 27, 2021, 9:02am

There’s the rotor tool, which was demoed at last year’s OCaml workshop and looked very promising (API migration: compare transformed (OCaml 2020) - ICFP 2020).

However, it only supported old versions of OCaml when I tried it, and there doesn’t seem to have been any recent activity.

aryx · July 27, 2021, 9:35am

It looks mostly restricted to renaming though? Could it perform the refactoring
on ASTs I presented above?

Yaron_Minsky · July 28, 2021, 12:28am

I know of nothing that you could use. Internally at Jane Street, @ceastlund has built some internal tools that do what you suggest: basically run a build, observe build errors, and apply transformations based on the returned errors. But I don’t know of any generally available versions of said tools.
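As a rough illustration of the "observe build errors" step only (this is not the Jane Street tool, whose design is not public), here is a minimal sketch that extracts error locations from compiler output. It assumes the usual single-line location header `File "f.ml", line N, characters A-B:`; the multi-line `lines N-M` form is ignored, and the names `loc`, `parse_location` and `locations_of_output` are hypothetical.

```ocaml
(* A source location as reported by the compiler in an error header. *)
type loc = { file : string; line : int; col_start : int; col_end : int }

(* Parse one line of build output; returns None for non-location lines.
   Assumes the single-line header form only. *)
let parse_location (s : string) : loc option =
  try
    Scanf.sscanf s "File %S, line %d, characters %d-%d"
      (fun file line col_start col_end ->
        Some { file; line; col_start; col_end })
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None

(* Collect every location mentioned in a full build log. *)
let locations_of_output (output : string) : loc list =
  String.split_on_char '\n' output |> List.filter_map parse_location
```

A driver would then run the build, feed its output to `locations_of_output`, apply a rewrite at each location, and repeat until the build is clean.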

nojb · July 28, 2021, 5:39am

As others have said, I don’t think there is anything publicly available for this kind of thing. When we have had to make large-scale refactorings like this at LexiFi, we have sometimes written ad-hoc automatic tools to help us. It is not as hard as one would think.

The easiest case is when the transformation can be phrased in purely syntactic terms. In this case, you can write a tool using compiler-libs that parses each file, obtaining the corresponding Parsetree. Then you perform two passes. First, you walk through the Parsetree, keeping track of the places where you need to change something: in your situation, it would be the match scrutinee and the final expression of each match case. Second, you do a “rewriting” pass where you use the information that you obtained in the previous pass to textually insert the new code fragments in the original source files.
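The second, “rewriting” pass can be sketched with the standard library alone. The sketch below assumes the first pass has already produced byte offsets into the original source (with compiler-libs these would typically come from `pexp_loc.loc_start.pos_cnum` and `pexp_loc.loc_end.pos_cnum` of the relevant Parsetree nodes). The `edit` type, `apply_edits`, and the hand-written offsets in the example are illustrative assumptions, not LexiFi's actual tool.

```ocaml
(* An edit inserts [text] at byte offset [pos] of the original source. *)
type edit = { pos : int; text : string }

(* Apply all insertions to [src]. Offsets refer to the ORIGINAL text, so the
   edits are sorted and the source is copied piecewise between them.
   Raises Invalid_argument if an offset lies outside [0, String.length src]. *)
let apply_edits (src : string) (edits : edit list) : string =
  (* Stable sort: edits at the same offset keep their list order. *)
  let sorted = List.stable_sort (fun a b -> compare a.pos b.pos) edits in
  let buf = Buffer.create (String.length src + 64) in
  let last =
    List.fold_left
      (fun prev { pos; text } ->
        Buffer.add_substring buf src prev (pos - prev);
        Buffer.add_string buf text;
        pos)
      0 sorted
  in
  Buffer.add_substring buf src last (String.length src - last);
  Buffer.contents buf

(* Example: offsets written by hand here; a real tool would take them
   from the locations recorded during the Parsetree walk. *)
let before = "match e with | Int i -> Int (i + 1)"

let after =
  apply_edits before
    [ { pos = 7; text = ".e" };      (* end of the match scrutinee *)
      { pos = 24; text = "mke (" };  (* start of the case body *)
      { pos = 35; text = ")" } ]     (* end of the case body *)
(* after = "match e.e with | Int i -> mke (Int (i + 1))" *)
```

Inserting text at recorded offsets, rather than pretty-printing the modified Parsetree, leaves comments and formatting in the rest of the file untouched.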

If the transformation cannot be phrased in purely syntactic terms, then you need to combine it with reading the .cmt files which contain the type-annotated Typedtree in order to find the places that need rewriting, and it is a bit more involved, but the overall logic remains the same.

Of course, it is so easy to do this kind of refactoring “by hand” just by following the compiler errors that the investment of writing this kind of tool is only worth it if you have a large codebase to refactor. It was worth it for us at LexiFi (~600k LOC), but for more reasonably-sized codebases the tradeoff may be less clear.

Cheers,
Nicolas

UnixJunkie · July 28, 2021, 7:08am

Even a tool to detect and remove dead code from sources would be useful.
I.e. you want to release some code from a prototype: everything not reachable
from main() should be removed.
For the mental sanity of people who will try to read the code later…

nojb · July 28, 2021, 8:01am

For global dead code elimination (but note this is not exactly the same as “everything not reachable from main() should be removed”), I know of

GitHub - rescript-association/reanalyze: Experimental analyses for OCaml/ReScript: for globally dead values/types, exception analysis, and termination analysis.
GitHub - LexiFi/dead_code_analyzer: Dead-code analyzer for OCaml (not currently being worked on; would need to be adapted to work with the latest OCaml).

Cheers,
Nicolas
