Records, tuples and variants subtyping

Why doesn’t OCaml support subtyping for tuples?T hat also seems to be the case for records, tuples and variants.

Is that a language design choice or some type system limitation?

Example:

let f (a, b) = a + b in
let a = f (1, 2, 3)
(* or *)
let b = ((1, 2, 3) :> (int * int))

For tuples and records, the runtime will access a field at a statically known position. For example, a reference is a record defined as type 'a ref = { mutable contents : 'a }. The function (!) : 'a ref -> 'a will return the item found in the first field of the block (field 0). So we can’t have a value of type { foo : int; contents : 'a }, cast it to an 'a ref, and then pass it to (!) because the contents field of our value is physically the second field, but the fst function would still read the first field.

This would prevent casting tuples to arbitrary slices, but shortening the tuple from the right should be possible. I don’t know if there’s a technical limitation for this. I would guess it’s just not that useful.

For classic variants, the problem is similar. The runtime knows about an int that represents a specific case. For the type A | B | C, A is 0, B is 1, and C is 2. If an object B of such type were to be cast to B | C, a function that expects this new type would assume B is 0 but we’d pass it the physical value 1, which doesn’t work.

These problems are solved with objects (of the OO kind) and polymorphic variants by representing the field or tag with a hash of the original name. For polymorphic variants, this doesn’t change much other than the ints representing tags are no longer consecutive. For objects, an extra indirection is required to map the hashed field ID to the physical field in the block representing a particular object. This is my coarse understanding of the problem. I haven’t studied how this is really implemented.

I know about the memory model, but should be simple to be able to extend things to the right

Sure but you could extend ref only, doing subtyping like inheritance, so that you keep the same memory model, like { ...ref, foo: int }

Same for variants, | super | C would be okay, while | C | super wouldn’t.

And both of them are really useful as now you can have functions that handle a subset of both, for records and tuples you would be able to get additional data inside something like an AST and for variants your functions could handle only a piece of the AST each time.

1 Like

I also like the idea of having “tuples with labeled fields”. Combined with “extension to the right” as you describe, this could be a useful form of inheritance while remaining efficient.

(I’m not suggesting this for ocaml, just finding it interesting)

I asked Pierre Weis this question many decades ago. I asked: SML has the syntax #1 (I think it was) that projects the first member of a tuple, of arbitrary length. Or #f that does the same thing for a record (projects the field f). Pierre pointed out that the expression
fun x -> #1 x has no principal type. In order to type that expression, you have to coerce the variable x. This means that (and I found then and find now this to be incredibly compelling) when you have a program that doesn’t type-check, it’s possible that you might need to add coercions to get it to type-check (in SML). Whereas, in caml-light (and hence, in OCaml minus polymorphic recursion, objects and polymorphic variants, IIRC), that isn’t true: if you have a program that doesn’t type-check, adding coercions won’t make it type-check; maybe adding coercions will help you discover why it wont’ type-check, but once you figure it out and fix the problem, you can remove all type-coercions.

I think that that’s accurate. I could be incorrect. It’s been a long time.

3 Likes

That is surprising. Here, the function has type ('a, 'r) -> 'a (with 'r denoting a row variable), which looks like a principal type to me. More generally, I don’t remember having ever read that row polymorphism breaks principal typing. But perhaps some other feature makes it difficult, if not impossible, to have row polymorphism in OCaml.

This was back before Didier Remy added objects to make it OCaml. I lost track of the type theory (went to make a living in the salt mines) pretty much after modules got added, so I can’t say for sure whether row polymorphism would break principal typing or not. But the unicity of inferred types is … a real boon, and I would be loath to give it up.