Can I get some type inference?

Ocaml can of course do some type inference, but awkwardly often I have to annotate argument types, and I wonder if there’s some coding practice that would cut down on this.

type wit = { weight: int; index: int }
type lit = { length: int; index: int }
type lwt = A of wit | B of lit

let show_wit a = Printf.sprintf "index %d, %dg" a.index a.weight
let show_lit a = Printf.sprintf "index %d, %dcm" a.index a.length

let show_lwt p = match p with
    | A a -> show_wit a
    | B a -> show_lit a
;;
print_endline (show_lwt (B {index = 1; length = 10}))

… And ocaml says about that:

| let show_wit a = Printf.sprintf "index %d, %dg" a.index a.weight
                                                            ^^^^^^
Error: This expression has type lit There is no field weight within type lit

Of course, from my point of view this was a rather obtuse analysis. This expression has a field “weight”, therefore why “infer” it was lit, and not wit which does have the required fields?

Very often these ill-inferences feature records with a common field like “name” or “index”, and perhaps that’s my error - i.e. Ocaml doesn’t really fully support that.

There’s also missing inference when a record is used prior to the place where its type is clearly established. E.g.,

let upw w = { w with weight = w.weight + 1 }

let uw i = let (wit: Witdef.wit) = { weight = i ; index = 0 }
    in upw wit
;;
print_endline (string_of_int (uw 50).weight)

… And ocaml says,

1 | let upw w = { w with weight = w.weight + 1 }
                         ^^^^^^
Error: Unbound record field weight

I’m obliged to explicitly declare (w: Witdef.wit), even though the it’s clear at some point in the compilation that uw calls upw with a record of that type.

If the alternative involved some way to write upw so that it would function with all records having a field weight of an integer type, I wouldn’t have a problem with that.

See When should I use single case constructors or type aliases? - #22 by yawaramin

One way to resolve the issue is to write show_wit using pattern matching instead:

let show_wit {weight; index} = Printf.sprintf "index %d, %dg" index weight
1 Like

Yes, that could be useful. I wish I could attach a parameter to it, as one does with match clauses for example - match a with | w@{weight = 50} -> w.index - to see how the inference reacts to let show_wit w@{weight}, but the syntax isn’t working in that context. Not that I’d have much use for it anyway.

It looks to me like it may work because type inference is more reliable on the left of the =, In the body of the function it arbitrarily picks the 2nd of two functions with index fields and informs me that the other field is wrong; in the declaration, it looks at both fields and makes a reasonable inference.

(As opposed to some kind of polymorphism where the function works with any {weight: int}. If weight is common to two records, we’re back to arbitrarily picking the 2nd.)

In OCaml this is written using as:

match a with {weight=50} as w -> w.index

This kind of polymorphism would involve a drastic change in the compilation scheme, which currently resolves field accesses to offsets at compile time: a.index and a.weight are compiled to something like a.[1] and a.[0].

It’s possible to write functions that are polymorphic over fields in this way using objects instead of records, if you really need to:

# let show_wit' a = Printf.sprintf "index %d, %dg" a#index a#weight;;
val show_wit' : < index : int; weight : int; .. > -> string = <fun>

Here the type < index : int; weight : int; .. > means that the object has to have index and weight methods and might have some other methods, too.

It should work, let bindings use the same pattern syntax as a single match case.

let show_wit ({weight; _} as w) = Printf.sprintf "index %d, %dg" w.index weight

Over 30yr ago, I asked Pierre Weis why this was done in (at the time) caml-light. He gave me what I thought was a completely satisfactory answer. It went like this:

Caml-light (and OCaml’s core language – not using objects or polyvariants) guarantees that for any particular piece of code, it is either well-typed, or it cannot be well-typed. If it is well-typed, then the inference algorithm will infer a most-general type for each bit of your code. KEY POINT: there is no third option of “the program is well-typed, if only I could add just the right type-annotations to help the typechecker along!!!”

Now consider that bit of code:

fun {w: weight} -> w

If there are two record-types (t1 and t2) with a field w:weight, then there is no most general type for this function; instead, there are two different types, t1 -> weight and t2 -> weight. There is no most general type.

By contrast, SML/NJ actually has (or had 32yr ago) a type of “records containing a field w of type weight”, so you can write that function in SML/NJ.

Pierre pointed out that by making this choice, caml-light delivered something else also: in a well-typed program (typeable by the caml-light inference machinery) you never need to add type-annotations to get your program to typecheck. Either it typechecks without annotations, or it doesn’t. period. This is what sold me on the caml-light approach to polymorphism. B/c I’ve experienced (in SML/NJ) that situation where I’ve got some code that I think is type-correct, but I can’t quite coax the typechecker to accept it. So I wander around, adding type-annotations, trying to convince the typechecker that it’s type-correct. Eventually I give up and start rewriting the actual code. With caml-light, you never do that annotation bit, except to help you figure out what the types of the various bits of the code are: they’re never necessary to help the typechecker do its job.

Concretely [IIRC] the way that caml-light achieved this, is that when a record type {w:weight, I : index} is defined, it -generatively- creates the field-names “w” and “l”, and associates them with this type. If you then define another type {w: weight, j : arglebargle}, it again generatively creates the field-names “w” and “j”. And the previous field-name “w” is no longer available. In OCaml I think some changes have been made to this, but it’s late and I’m not going to try to remember exactly what’s different. But the basic idea remains, I think: you never have to add type-annotations to get your core OCaml (that is, no objects, no polyvariants) program to typecheck.

2 Likes

I believe SML should fail here. It doesn’t have row-polymorphism and it needs a principal type for records just as OCaml. What SML offers for records over OCaml is that they can be anonymous (they’re structurally typed), but not polymorphic!
Perhaps SML/NJ had a row-polymorphism extension?

What’s different (I believe it’s not in caml light?) is that when the type inference algorithm is sure (from previous context) that an ambiguous field is one type and not another, it disambiguates automatically without needing your annotations. So in some specific cases, you can retrieve the shadowed w without doing any type annotations. (see section 1.4.1 in the manual)

Perhaps SML/NJ had a row-polymorphism extension?

It used-to-be, back in the day, you could write #f which was the function that extracted the field named f. IIRC it wasn’t typeable all by itself, but in a context where its argument was a type which had a field named f, it was well-typed. There were other things like that, that SML/NJ did.

So your guess might be correct.

oh yeah, for selectors it’s special-cased and is indeed the same across implementations. but record selectors aren’t really first-class functions, and functions can’t be written to take the role of a “generic” selector (you can’t define fun f r = #f r), they don’t have a canonical type in SML, and so they require annotation or context to work…
to have them be first-class functions, row-polymorphism is required. then they’d be like ocaml methods, where let f r = r#f has the proper most-general type <f : 'a; ..> -> 'a

1 Like