Converting typing information from one OCaml version to a later one?

It would be convenient to have a tool that can convert OCaml typing information, in particular Env.summary and possibly Subst.t, from a given version of OCaml to a later version of OCaml. (Ideally, whole .cm{i,o,t,ti} files could be converted this way.) Intuitively this should be possible as new versions rarely remove information or make things inexpressible, and it could be useful for many tooling authors, who could consider supporting one recent version of OCaml and converting older artifacts on the fly.

Has someone tried to do this before? Does someone know that it is possible, or that it is blocked by some fundamental issue? (I wonder in particular if the Merlin people have thought about it; cc @let-def @trefis, and also maybe @ejgallego's experiments.)

The question comes from a discussion with @hackwaly about how to have ocamlearlybird support multiple versions of OCaml in a robust and maintainable way.


I have a need for exactly this feature.

My interest is to make my patched compiler, which offers “improved” type error messages, easily available. There are numerous possible approaches, but the one that seems realistic to maintain in the long term, and that could be easily exploited from any OCaml project, would be to offer a tool that is able to:

(1) take as input a given .ml file containing a type error,
(2) read the .cmi files associated with the dependencies—these .cmi files could be generated by any version of the compiler,
(3) generate a type error using my patched compiler, where my patch remains based on an older version of the compiler.

The interest of this approach is that I don’t need to update my patch on every version of the compiler, yet it would be available to process any .ml file that does not exploit cutting-edge technology. This would allow non-expert programmers to benefit from alternative, possibly more informative, type error messages, without the constraint of compiling the full code base using an out-of-date compiler.


Interesting, @gasche. The only use case I could have is for a tool that tries to convert OCaml type definitions to TypeScript type definitions, but I am not sure whether such a conversion would be useful there.

I also heard just today of coqffi which apparently works by inspecting .cmi files, and may also benefit.


This would also allow a typed ppxlib, as the typed PPX would be able to read the environment from an old OCaml version with the new typechecker version.

For me, reading the .cmi is enough, so I implemented it in ocaml-migrate-types. It can migrate types from 4.11 → 4.12. I will implement 4.10 → 4.11 next, which is probably painful because of the Uid.

https://migrate_411_412.ml

edit:

4.10 → 4.11 is now implemented as well; I am a bit concerned about the type_separability thing.
https://migrate_410_411.ml
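
To illustrate why a field such as the Uid complicates a migration, here is a minimal sketch. It assumes that the newer declaration carries an identifier field with no counterpart in the older one, so the migration has to invent a value. The modules `V410` and `V411`, the type `decl`, and the helpers `make_uid_generator` and `copy_decl` are hypothetical stand-ins, not the real `Types` definitions nor the code of ocaml-migrate-types.

```ocaml
(* Hypothetical, simplified stand-ins for one declaration type as it
   might look in two compiler versions. *)
module V410 = struct
  type decl = { name : string; arity : int }
end

module V411 = struct
  type uid = Uid of int
  (* The newer version has an extra field absent from the older one. *)
  type decl = { name : string; arity : int; uid : uid }
end

(* Assumption: a fresh identifier is acceptable for migrated values.
   Each call to the returned function yields a new identifier. *)
let make_uid_generator () =
  let counter = ref 0 in
  fun () ->
    incr counter;
    V411.Uid !counter

(* Copy the fields that exist on both sides and synthesize the new one. *)
let copy_decl fresh_uid (d : V410.decl) : V411.decl =
  { V411.name = d.V410.name; arity = d.V410.arity; uid = fresh_uid () }
```

Whether fresh identifiers are good enough depends on what the consumer does with them; a tool that relies on identifiers being stable across files would need a different strategy.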

edit2:

4.08 → 4.12 is ready.

edit3:

This can probably be used on CMT files too, by doing Untypeast → Type (using the latest version). I wonder if that would allow Merlin to support older versions.

edit4:

I was able to run the OCaml 4.12 typechecker on 4.08 and successfully convert the CMIs in order to type code. More testing is needed, but hopefully the transformations are correct and we can then use them to load CMT files.


That’s great, @EduardoRFS! Note the open issue “[feature request] support migration of `types.mli`” (Issue #52 in ocaml-ppx/ocaml-migrate-parsetree on GitHub); it seems to me that your work would solve it.

@EduardoRFS looks promising, thanks!

  1. Any chance you could be convinced to also deal with Env.summary and possibly Subst.t, so that we can port all the debug information? (This opens the door to porting complete bytecode programs, the rest can be done without touching the type-checker data.)

  2. In your code I see some %identity externals for types that have not changed. Is it really worth it? It looks like you save some code (but there is already a lot of boilerplate-y code, so I’m not sure that expanding the identity there would be much worse), but it adds fragility to the system and requires human intervention. I think that for maintenance it would be better if this trick were not used.
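
To make the trade-off in point 2 concrete, here is a minimal sketch contrasting the two styles. The modules `Old` and `New` and the type `variance` are hypothetical stand-ins for one type declared identically in two compiler versions; they are not taken from the generated migration files.

```ocaml
(* Hypothetical stand-ins for the same type as declared in two
   compiler versions; the names are illustrative only. *)
module Old = struct
  type variance = Covariant | Contravariant | Invariant
end

module New = struct
  type variance = Covariant | Contravariant | Invariant
end

(* Shortcut: reuse the runtime representation unchanged. The
   type-checker cannot verify that the two declarations still agree,
   so a later change to either one can go unnoticed. *)
external copy_variance_unchecked : Old.variance -> New.variance
  = "%identity"

(* Expanded identity: more boilerplate, but a renamed or removed
   constructor becomes a compile-time error, and reordered
   constructors are still mapped correctly. *)
let copy_variance : Old.variance -> New.variance = function
  | Old.Covariant -> New.Covariant
  | Old.Contravariant -> New.Contravariant
  | Old.Invariant -> New.Invariant
```

As long as the two declarations are identical, both functions return the same values; they only differ in what happens when a future version changes one of the declarations.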

  1. Yes, I can do it this weekend. If I forget, feel free to send me a message.

  2. These identities are not only for types that didn’t change; they also deal with some types that are private in Types, like the Maps and the unboxed status. I can probably fix them. I was just being lazy, as I wrote the generator tool and produced the generated files in a couple of hours.
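
For abstract types such as maps, one possible alternative to `%identity` is to rebuild the value through its public interface. The following is a minimal sketch under that assumption; `Old_map`, `New_map` and `copy_map` are hypothetical names, and the two modules are sealed here only to mimic two incompatible abstract map types.

```ocaml
(* Two sealed instances stand in for the map types of two compiler
   versions: their representations are hidden, so no structural copy
   can be written by pattern matching. *)
module Old_map : Map.S with type key = string = Map.Make (String)
module New_map : Map.S with type key = string = Map.Make (String)

(* Rebuild the new map binding by binding, converting each value
   with [copy_value]. *)
let copy_map (copy_value : 'a -> 'b) (m : 'a Old_map.t) : 'b New_map.t =
  Old_map.fold
    (fun key v acc -> New_map.add key (copy_value v) acc)
    m New_map.empty
```

This costs a full traversal and loses any sharing between the old and new structures, but it stays valid even if the internal representation of the map changes between versions.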

The code was generated and then fixed by hand afterwards, like what is done in ocaml-migrate-parsetree. I didn’t know at the time that this tool existed, so I wrote another one. I do know about a couple of bugs that need to be handled.