AST tracking of type variables in polymorphic let-bindings (an old question)

Hi all,

I’m interested in obtaining a version of the typed AST, where the constructors Texp_let and Tstr_value carry the list of type variables that are introduced for these constructs, when the body admits a polymorphic type.

Back in 2009, I implemented a quick patch to the compiler for that. Since then, the compiler has evolved, and I would like to rebase my work on a recent version of the compiler.

I’ve already asked the question a few years ago, but things may have evolved, hence I’m asking the question again: Does the current compiler, or a branch thereof, implements the tracking of type variables? If not, would there be interest in integrating this feature in the typechecker?

Note: At a high level, the implementation consists in opening a fresh scope for every let-binding, registering in the current scope fresh type variables (in newty2, when the type variable is not associated with any structure), and when closing the scope gathering the duplication-free list of type variables that have been generalized (t.level = generic_level), and storing that list in the let-constructor of the typedtree (e.g. Texp_let(rec_flag, fvs, pat_exp_list, body), where fvs denotes that list). The patch is less than 100 lines.

If you think of complications related to GADTs, or other advanced features, please let me know.

Thanks
Arthur

Salut Arthur, and sorry for the slow answer.

I am not aware of a branch of the compiler doing that, but there are several persons working on projects that are bound to do something similar:

  • Pierrick Couderc mentions the problem in his thesis, and typechecker for typedtree was rebuildling scopes.
  • Guillaume Claret’s coq-of-ocaml clearly needs that too.
  • And my recent ocaml_in_coq branch, which is another translation from typedtree to Coq, will need to do that eventually (it currently does it on the fly, but this will certainly break at some point)
  • I also remember that Ryohei Tokuda and Eijiro Sumii had a similar problem in their “Specialization of Generic Array Accesses After Inlining”

Seeing the demand, I would think that merging this would be a good idea.
As to the problems that may occur, the typechecker does generalize in many places, so I’m not sure your strategy is always suifficient, particularly in principal mode. generalize_structure does not concern type variables, but the distinction is not always clear, so I’m also thinking of making a clearer separation between the different roles of levels.

Thanks Jacques for your answer. It would be great indeed if you could contribute to one correct implementation! Let me know if I can help somehow, e.g., for testing the feature.