Which kind of recursive types are allowed?

I learned that:

type r = (int -> 'a) as 'a

doesn’t typecheck without -rectypes. I’m trying to grasp what recursive types exactly do typecheck, and why.

Everybody knows you can do:

type tree = Leaf of int | Node of tree * tree

so creating a recursive type this way is allowed.

You can’t do:

type 'a rw = R of 'a
type r = (int -> 'a rw) as 'a

but this works:

type 'a rw = [`R of 'a]
type r = (int -> 'a rw) as 'a

So what is the rule here?

From the manual:

If the type variable ident actually occurs in typexpr, a recursive type is created. Recursive types for which there exists a recursive path that does not contain an object or polymorphic variant type constructor are rejected, except when the -rectypes mode is selected.

Thank you. Why are cycles allowed with polymorphic variants, but not with regular ones?

Adding more confusion… Is an error expected when the second type keyword is replaced by and?

# type 'a r = [`R of 'a] type 'a t = 'a r as 'a ;;
type 'a r = [ `R of 'a ]
type 'a t = 'a constraint 'a = 'a r

# type 'a r = [`R of 'a] and 'a t = 'a r as 'a ;;
Error: The definition of t contains a cycle:
       'a r as 'a

In this code snippet:

type 'a rw = [`R of 'a]
type r = (int -> 'a rw) as 'a

I think that the first line is not a type definition, but a type alias (like type t = int). So in the second line, this alias is expanded to the following, which contains a cycle.

type r = (int -> [`R of 'a]) as 'a

Edit: I can’t read, this is incorrect

Consider the translation of your ADT definition to a polymorphic variant:

type tree = [ `Leaf of int | `Node of 'tree * 'tree ] as 'tree

The recursive ADT definition leads to a recursive polymorphic variant type expressions. Thus if we want polymorphic variants to be as expressive as regular variants we need to allow recursive polymorphic variant types.

1 Like

@emillon that example is valid though

@octachron wouldn’t the argument work backwards too?

  • why don’t we have to write:
    type tree = (Leaf of int | Node of 'tree * 'tree) as 'tree
    
    ?
  • why can’t we write:
    type tree = [ `Leaf of int | `Node of tree * tree ]
    
    ?
1 Like

This works too, it is another way to write the same recursive type expression.

This is not well typed: aliases like _ as 'x works on type expression.

| Leaf of int
| Node of ...

is a type definition not an expression. Such type definition introduces a fresh type name which gives us an easy way to define recursion by going through this new name. In particular, recursion can only be achieved by expanding the new mutually recursive types. Recursive type expressions can be more unwieldy:

let f (`X x: [ `X of [`Y of 'b * 'a | `Z ] as 'b] as 'a) = x

Ok, I think I see it now:

  • recursive type definitions like type t = (* something with t *) are valid;
  • in type expressions recursion is usually not allowed, like in (int -> 'a) as 'a, for these reasons;
  • it is allowed if the recursion involves polymorphic variants (or objects, but let’s ignore them here)

And the reason why poly variants need the recursion for the tree example, while normal variants don’t, is that:

  • if I use Leaf and Node, the typechecker will infer type tree;
  • on the other hand, with `Leaf and `Node the inferred type is [ `Leaf of int | `Node of 'tree * 'tree ] as 'tree,

so the inference algorithm needs to “generate” a recursive type for the poly variant, while it already knows that the regular variant is recursive.

Is this right?

Indeed polymorphic variants are structural types (aka type expressions), thus a recursive function like

let rec proj x = match x with
| `Leaf leaf -> leaf
| `Node (left,_) -> proj left

needs to be able to decide that the type of x and the type of left are the same. Thus the type system needs to be able to handle equations inside the type expressions. Contrarily with a nominal variant, it is the type definition itself that introduces the type equations

type tree =
| Leaf of int
| Node of tree * tree

and we know in advance that those equations only needs to be used when building or deconstructing a value of type tree. (And since the recursion is more constrained, type definions can handle non-regular recursive types like

type 'a balanced =
| Leaf of 'a
| Node of ('a * 'a) balanced

)

1 Like

Perfect, thank you for the insight (and patience!). Framing the issue as structural vs nominal makes much sense.

I didn’t know that polymorphic variants can’t express non-regular recursion, but I think I’ll leave this topic for another day :slight_smile:

An educational side benefit of -rectypes is that it lets you play with the pure untyped lambda calculus: Lambda calculus: Encoding datatypes.