Type definition contains a cycle?

I’ve been trying to write a simple tree data structure, and cannot quite understand
what exactly constitutes a cyclic type definition:

This code works:

module KMap = Map.Make(K)
type 'a dir = 'a entry KMap.t
and 'a entry =
| Leaf of 'a
| Tree of 'a dir
and 'a t = 'a dir

While replacing the definition of 'a entry with
and 'a entry = ('a, 'a dir) Either.t

results in “The definition of dir contains a cycle”.

This is quite mystifying, since the (working) definition of entry is indeed isomorphic to ('a, 'a dir) Either.t

It is considered a cyclical type definition when both the recursively-defined types are ‘type abbreviations’ (i.e. type aliases) as opposed to new created types. Here’s a simple example:

# type t = u
and u = t;;
Error: The definition of t contains a cycle:
       u

And when we make them new types as opposed to abbreviations:

# type t = U of u
and u = T of t;;
type t = U of u
and u = T of t

I don’t see a good reason for this to produce an error - looks like a bug in the type checker to me. I tried out a more minimal example, and got the same error.

# module IntMap = Map.Make(Int);;
# type 'a dir = 'a entry IntMap.t and 'a entry = ('a, 'a dir) Either.t;;
Error: The definition of dir contains a cycle:
       'a entry

Per Yawar, that’s a cycle, too, yes? The constraint makes sense: you expand all type-abbreviations, and if you end up with an infinite expansion or an expansion that doesn’t terminate, you raise an error.

In the example, we might want for 'a dir to not get expanded: but abbreviations are not supposed to be special: at least, that’s how I’ve always thought they were to be treated.

Okay, I think I understand why the type checker disallows this after reading this.

Consider an even more minimal example…

# type 'a entry = ('a, 'a entry IntMap.t) Either.t;;
Error: The type abbreviation entry is cyclic

This is cyclic, as in self referential, but it’s not a cycle of abbreviations. As the OP states, this is isomorphic to the allowed type definition

# type 'a entry = Left of 'a | Right of 'a entry IntMap.t;;
type 'a entry = Left of 'a | Right of 'a entry IntMap.t

But the type checker treats IntMap.t and Either.t as abstract, it doesn’t know if it’s a type of variants, or if it is a product type… What if we had:

type 'a IntMap.t = 'a * 'a
type ('a,'b) Either.t = 'a * 'b

Then 'a entry would be isomorphic to:

type 'a entry = 'a * ('a entry * 'a entry)

The linked article describes why type abbrevations of the above kind are problematic. Since the type checker doesn’t know if IntMap.t and Either.t types are variants or products (or some other kind of type), it conservatively complains about types being cyclic.

1 Like

I’m not convinced that “infinite expansion” line of reasoning is works well.

Expanding 'a dir or 'a entry does not seem problematic for me, it’s the not expanding Either.t that’s the problem.

OK then, but why would the type checker treat Either.t as abstract?
the module signature exposes the usual variant definition, so it’s definitely
not ('a, 'b) either = 'a * 'b (which would make the entire definition cyclic, unfounded, and
doubleplusungood).

Back to the abbreviation vs. definition issue, it amounts to beta-expansion at the (non-abstract)
type level. I would have expected referential transparency here.

From a couple of sources, it seems allowing these recursive type abbreviations can lead to accepting ill-formed values:

https://caml.inria.fr/pub/docs/oreilly-book/html/book-ora209.html

If this mode may be useful in some cases, it tends to accept the typing of too many values, giving them types that are not easy to read.

https://ocaml.org/api/Lazy.html#TYPEt

Note: if the program is compiled with the -rectypes option, ill-founded recursive definitions of the form let rec x = lazy x or let rec x = lazy(lazy(...(lazy x))) are accepted by the type-checker and lead, when forced, to ill-formed values that trigger infinite loops in the garbage collector and other parts of the run-time system. Without the -rectypes option, such ill-founded recursive definitions are rejected by the type-checker.

But that’s it though, OCaml does not do a lot of computation at the type level.

You’ll have to ask the language designers and implementers for their own reasoning, but from my limited experience with implementing type checkers I would find it very weird to do beta-expansion (or delta-expansion) at the type level for a language like OCaml. It’s not trying to be a fully dependently typed language, so there won’t be a lot of benefit from complicating the type checker and wrecking error messages.

@yawaramin is right: Ocaml makes the distinction between type abbreviations

type 'a abbr = int * 'a

which create short-hand for type expressions, and are thus subject to the same restriction as ordinary type expressions and type definitions

type 'a def = A of 'a

which creates new types. The two are fundamentally different in term of equalities: a type abbreviation introduces a new equality between two type expressions ('a abbr= int * 'a) whereas a type definition introduces a new unique entity with no (known) equality with previously defined types.

Consequently, type abbreviations cannot create invalid type expressions. For instance,

type 'a t = 'a t list

is only valid when regular recursive type expressions have been allowed with the -rectypes option. Moreover, non-regular recursive type expressions, for instance

type 'a nr = 'a list nr list

are never allowed in OCaml ( (which corresponds to a type graph of infinite size even in presence of sharing) and yields this error message:

Error: This recursive type is not regular.
The type constructor nr is defined as
type 'a nr
but it is used as
'a list nr.
All uses need to match the definition for the recursive type to be regular.

For more information, you might be interested in looking at the difference between iso-recursive and equi-recursive types.

5 Likes

While the type checker does not expand Either.t, it does not treat it as abstract either. As mentioned by others, it knows that Either.t constructs a value, and as a result your type definition is allowed in -rectypes modes, which is of course sound (i.e., -rectypes is not unsafe).

So why is your definition refused by default?
This is mainly because allowing it makes type inference less useful at detecting mistakes.
Here is an example using -rectypes:

# let rec append l1 l2 = match l1 with [] -> l2 | a::l -> a :: append a l2;;
val append : ('a list as 'a) -> 'a list -> 'a list = <fun>

About 20 years ago OCaml used to have -rectypes as default for some time, but it was eventually decided the confusion caused by accepting such definitions was not worth the added expressive power.

Note that the trade-off is different for polymorphic variants and objects. There, recursion is allowed as long as it goes through either a polymorphic variant or object type.

# let rec append l1 l2 = match l1 with `Nil -> l2 | `Cons (a,l) -> `Cons (a, append l l2);;
val append :
  ([< `Cons of 'b * 'a | `Nil ] as 'a) -> ([> `Cons of 'b * 'c ] as 'c) -> 'c =
  <fun>

So you could actually write your code by defining Either.t as:

type ('a,'b) either = [`Left of 'a | `Right of 'b]

Note that inferring such equi-recursive types (as mentioned by @octhacron) also comes with some restrictions on the form of recursion allowed (only regular types are allowed, which is weaker than for explicit type definitions). But this is another story (while there does exist a stronger algorithm, it is extremely complex).

4 Likes

Sorry to revive an old discussion here, not sure where else to post. This seems like a flaw (even a bug) in OCaml.

All of the examples for why recursive types lead to confusion have to do with inferred types of functions or data structures. None of them are examples of where a type alias like type 'a entry = ('a, 'a entry IntMap.t) Either.t above can cause trouble.

But if you enable -rectypes to allow the non-problematic cases, you have to also enable all the bad cases from the examples.

Are there any examples of where in a type alias, allowing recursive types can cause a problem?

Motivation: I wanted to create a type like

type 'e gast = Int | Sum of ('e * 'e) | ...
type ast = ast gast

Which is equivalent to defining ast as type ast = Int | Sum of (ast * ast) | ... but also allows stuff like:

type mapper = ('a gast -> 'b) gast
fun map_ast : ('a, 'b) mapper -> 'a gast -> 'b gast

type 'a with_span = WithSpan of (int, int, 'a)
type spanned_ast = spanned_ast gast with_span

etc.

The problem is that you are introducing a distinction that does not exist.
Namely, a type alias is just an alias, strictly equivalent to the type where this alias has been fully expanded. So for it to be valid, the version without alias should be valid too.

I understand what you want, and I think many people would like to have that, including me. But this more or less amounts to introducing AI into the type checker, to let it decide which cycles are intentional and which are not.

A workaround here is to make ast a one-constructor polymorphic variant type:

type ast = [`Fix of ast gast]

Another common situation is that you may want to add extra information in ast nodes, and use an object type.
Of course, if you don’t care about the node being structural, you can also use a record type (which may be more comfortable actually).

1 Like

In these cases, what I like to do is use a single constructor type with unboxed:

 type t = Mk of t shape [@@unboxed]

This way, the memory representation of t should be the same as the idiomatic original definition, but you can also write functions parametric over nodes:

  val pp_shape: (Format.formatter -> 'a -> unit) -> Format.formatter -> 'a shape -> unit
  val compare_shape: ('a -> 'a -> int) -> 'a shape -> 'a shape -> int
  val op: 'a shape -> op
  val children: 'a shape -> 'a list
  val map_children: 'a shape -> ('a -> 'b) -> 'b shape
  val make : op -> 'a list -> 'a shape

I use this in ego to allow users to supply modules with relatively arbitrary term structures while still allowing building generic components on top of them (ego/lib/language.ml at master · verse-lab/ego · GitHub)

3 Likes