Is it possible to generate type definitions using expressions?

Suppose I want to expand the first declaration below into the second:

type t = A [@len 2] 
[@@deriving abc]


type t = A0 | A1

This is straightforward using ppxlib since 2 is a known constant. However, I was wondering if it’s possible to generate type definitions based on expressions:

let x = Int.of_string "2" (* some arbitrary expression *)
type t = A [@len x] 
[@@deriving abc]

I am able to generate let bindings, but type declarations appear to be tricky.

No, this is not possible, at least at compile time. You can do something like this though:

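(* An extensible variant: constructors can be added after the fact. *)
type t = ..
module type S = sig
  type t += A
end
let gen_constructors : int -> t list = fun len ->
  let rec helper i =
    if i >= len then []
    else
      (* Each evaluation of the struct creates a fresh constructor A. *)
      let (module M : S) = (module struct type t += A end) in
      M.A :: helper (i + 1)
  in
  helper 0
let constructors = gen_constructors (int_of_string "2")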

This is essentially generating a bunch of constructors at runtime. Not sure how useful this is though since you no longer have exhaustiveness checking in pattern matching.
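To make the exhaustiveness point concrete, here is a minimal sketch using ordinary top-level extensions rather than the runtime-generated ones above. The names `describe`, `B` and `C` are only for illustration.

```ocaml
(* An extensible variant: constructors can be added anywhere, at any time. *)
type t = ..
type t += A | B

(* A match on an extensible type needs a catch-all case; without it the
   compiler warns that the match is not exhaustive. *)
let describe : t -> string = function
  | A -> "A"
  | B -> "B"
  | _ -> "unknown"

(* A constructor added later is silently routed to the catch-all, so the
   compiler never tells you that [describe] is missing a case. *)
type t += C

let () = List.iter (fun v -> print_endline (describe v)) [A; B; C]
```

With a closed type such as `type t = A0 | A1`, adding a constructor would instead produce a warning at every match that forgets it.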

EDIT: This might be possible with a restricted set of expressions (i.e. ones that don’t need other context) since the ppx could just evaluate the expression but this also seems like a bad idea.

Could you please elaborate on the last part?

ppx could just evaluate the expression

For an expression

Int.of_string "2"

the AST gets expanded to

            expression (//toplevel//[1,0+8]..[1,0+21])
              Pexp_ident "Int.of_string" (//toplevel//[1,0+8]..[1,0+21])
                expression (//toplevel//[1,0+22]..[1,0+25])
                  Pexp_constant PConst_string("2",(//toplevel//[1,0+23]..[1,0+24]),None)

Can a ppx turn this expression into the following (at compile time)?

expression (//toplevel//[1,0+8]..[1,0+9])
            Pexp_constant PConst_int (2,None)

Imagine a “const” declaration in OCaml. It would only allow a limited set of functions to be used, and you’d have to be able to compile the code of the const declaration as a separate program that would run the declarations and print out their values, to be re-parsed by the PPX rewriter and then pasted back into the program. Then you could use such “const” names in your type-declaration. You might also allow const declarations to import const declarations from other modules: you might use PPX attributes to carry the values to which those decls evaluated, so they could be used by importing declarations.

Seems complicated, and non-intuitive things might happen due to module-visibility differences between the const-evaluation mechanism and the regular compiler, but then that already sort of happens with ppx_import I guess. Anyway, it should work I guess.

I think we can generalize it a bit.

A constant expression is a production with constants as terminals in the AST.

Only the top-level constexpr node needs to be annotated. The PPX will recursively traverse down the expression graph to ensure that it reduces to a constant node (consistent with the OCaml AST).

This would allow use cases such as

[%const Int.to_string (Int.of_string "2")]
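The recursive traversal could look roughly like the sketch below. It works on a toy AST, not on the real Parsetree, and the `expr` type, the whitelist in `apply_builtin` and the function names in it are all assumptions made for illustration. A real PPX would have to do the same over ppxlib's expression type.

```ocaml
(* Toy expression AST; hypothetical, much smaller than OCaml's Parsetree. *)
type expr =
  | Int of int
  | Str of string
  | Apply of string * expr list  (* function name applied to arguments *)
  | Var of string                (* a free variable: not a constant *)

(* Whitelist of pure functions the evaluator is allowed to run. *)
let apply_builtin name args =
  match name, args with
  | "Int.of_string", [Str s] ->
    Option.map (fun n -> Int n) (int_of_string_opt s)
  | "Int.to_string", [Int n] -> Some (Str (string_of_int n))
  | _ -> None

(* Fold an expression down to a constant. Returns None if a leaf is not a
   constant or a function is outside the whitelist. *)
let rec eval = function
  | (Int _ | Str _) as c -> Some c
  | Var _ -> None
  | Apply (f, args) ->
    let rec eval_all = function
      | [] -> Some []
      | a :: rest ->
        (match eval a with
         | None -> None
         | Some v -> Option.map (fun vs -> v :: vs) (eval_all rest))
    in
    Option.bind (eval_all args) (apply_builtin f)

(* The [%const ...] example above, written in the toy AST. *)
let folded =
  eval (Apply ("Int.to_string", [Apply ("Int.of_string", [Str "2"])]))
```

Here `folded` is `Some (Str "2")`, while anything that mentions a `Var` folds to `None`, which is where the PPX would have to report an error.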

Not trivially as far as I’m aware. Are you able to share a bit more motivation as to why you want to do this? If you’re restricted to constants known at compile time, why not just “manually” evaluate the constant (as in your first example)?

Memory-mapped I/O can use arbitrary address locations across different hardware. Types provide type safety. By generating typed memory addresses, I can reduce the chances of bit overflows and other memory addressing errors.

That might be easier to do with an abstract type: one that is an integer but hides that fact, with the bounds checked in the one function that builds a value of this type.

There is still one unsafe function to write, but you get safety everywhere else.
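A minimal sketch of that approach follows. The module name `Addr` and the bounds `base` and `limit` are made up for illustration; real values would come from the hardware description.

```ocaml
module Addr : sig
  type t                        (* abstract: callers cannot forge a value *)
  val of_int : int -> t option  (* the single checked constructor *)
  val to_int : t -> int
end = struct
  type t = int

  (* Hypothetical bounds of a memory-mapped region. *)
  let base = 0x1000
  let limit = 0x1FFF

  (* The only place where a raw integer becomes an address. *)
  let of_int a = if a >= base && a <= limit then Some a else None

  let to_int a = a
end

let () =
  match Addr.of_int 0x1004 with
  | Some a -> Printf.printf "valid address: 0x%x\n" (Addr.to_int a)
  | None -> print_endline "out of range"
```

Outside the module, `Addr.t` cannot be confused with a plain `int`, so any function that takes an `Addr.t` can rely on the bounds check having already happened.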

I have tried that. It sort of works, but it’s not elegant. Fundamentally, I think parameterized type generation using constant expressions is a useful compiler feature with potential benefits. Perhaps we can propose its incorporation into the standard OCaml compiler?

Executing constant expressions would be a very big feature. I like what C++ did with this, but it is complicated and has drawbacks.

I think the closest thing there is is BER MetaOCaml

There are no plans that I am aware of to integrate MetaOCaml upstream, but there is this project: Modular macros, which is slightly more limited in scope but has fewer drawbacks, and whose authors (I think) want to push it upstream eventually.

I don’t understand the benefit in this case. If you don’t know before compile time what variants inhabit your sum type, how is the sum type useful in writing the program? On the other hand, if you know when writing the code how many variants there are, then you don’t need evaluation of constant expressions etc.

Could you give some example of how you expect to use a sum type with constructors you don’t know about when writing the program?

As mentioned previously, the use case you’ve described seems like a good fit for abstract types, imo (e.g. as in Lightweight Static Guarantees).

Types generated by one package X can be exposed via modules to a user-facing package Y. As an example, see janestreet/ppx_variants_conv

The idea is used for generating hardware abstraction libraries

Ah, interesting. So wouldn’t this mean the “same” program (the same source code with the same dependencies) would compile correctly on one system but fail to compile, with a type error, on a different system?

I don’t understand the relevance of this comparison. ppx_variants_conv generates functions to manipulate values of specified types. It does not generate types of some unknown shape based on the system it is compiled on.

Except that ppx_variants_conv deals with types that are known at compile time (e.g., case = 2 in the original post).

In project A, I can generate a type t in module B, and then include B.t in any other module inside project A. I get the benefits of pattern matching (e.g., missing cases).

I can apply ppx_variants_conv on the generated type to derive whatever properties that I want e.g.,

type t = A [@range x] [@@deriving generator, variants]

Derivers allow precedence, so generator can be applied before variants. Generator will expand the type to create A0 … Ax, and then variants will do whatever it needs to.

The trick is that the generator ppx will forward the @@deriving variants (and any other remaining derivers in the list) to the generated type.