Why Do Simple Let Bindings Introduce Extra Type Assertions?

Brief

I was just surprised to discover that

let x : int = 4 in x

actually produces the AST I would’ve expected from

let x : int = (4 : int) in x

This was a bit jarring, as the PPX code I’m writing attempts to move a subexpression in such a way that this extra type annotation is a problem. Does anyone have any idea why this is happening? I’d be reluctant to manipulate this in my code if I didn’t understand the motivation. I’m seeing this behavior on OCaml 5.0.0 as well as OCaml 4.13.1.

Detailed

More specifically, consider the following ocaml session:

$ ocaml -noinit -dparsetree
OCaml version 5.0.0
Enter #help;; for help.

# let x : int = 4 in x;;
Ptop_def
  [
    structure_item (//toplevel//[1,0+0]..[1,0+20])
      Pstr_eval
      expression (//toplevel//[1,0+0]..[1,0+20])
        Pexp_let Nonrec
        [
          <def>
            pattern (//toplevel//[1,0+4]..[1,0+11]) ghost
              Ppat_constraint
              pattern (//toplevel//[1,0+4]..[1,0+5])
                Ppat_var "x" (//toplevel//[1,0+4]..[1,0+5])
              core_type (//toplevel//[1,0+8]..[1,0+11]) ghost
                Ptyp_poly
                core_type (//toplevel//[1,0+8]..[1,0+11])
                  Ptyp_constr "int" (//toplevel//[1,0+8]..[1,0+11])
                  []
            expression (//toplevel//[1,0+4]..[1,0+15])
              Pexp_constraint
              expression (//toplevel//[1,0+14]..[1,0+15])
                Pexp_constant PConst_int (4,None)
              core_type (//toplevel//[1,0+8]..[1,0+11])
                Ptyp_constr "int" (//toplevel//[1,0+8]..[1,0+11])
                []
        ]
        expression (//toplevel//[1,0+19]..[1,0+20])
          Pexp_ident "x" (//toplevel//[1,0+19]..[1,0+20])
  ]

- : int = 4

Observe the Pexp_constraint in the output. Given that I never wrote a type assertion in an expression, I didn’t expect to see a Pexp_constraint here.

Of note: more complex patterns don’t appear to exhibit this behavior. For instance,

let (x : int, y) = (4, 5) in x

produces a tree with no Pexp_constraint nodes.

I looked over parser.mly in the OCaml repository and didn’t see anything there that would explain this behavior, but I’m not deeply familiar with that code base. Any ideas what prompts this behavior or whether it’d be a problem if I hacked in a mechanism to remove the (hopefully extraneous) type assertion nodes?

This has been simplified in 5.1, which explains why you don’t see it in parser.mly.
With 5.1,

let x : int = 4

produces now a value constraint stored in the value binding node itself

Ptop_def
  [
    structure_item (//toplevel//[1,0+0]..[1,0+15])
      Pstr_value Nonrec
      [
        <def>
          pattern (//toplevel//[1,0+4]..[1,0+5])
            Ppat_var "x" (//toplevel//[1,0+4]..[1,0+5])
          core_type (//toplevel//[1,0+8]..[1,0+11])
            Ptyp_constr "int" (//toplevel//[1,0+8]..[1,0+11])
            []
          expression (//toplevel//[1,0+14]..[1,0+15])
            Pexp_constant PConst_int (0,None)
      ]
  ]

This value constraint is then translated to what amount to a constraint on both the expression side and pattern side, but there are no longer duplicated constraint nodes in the AST.

The motivation for this behavior is that the type annotation is useful both on the pattern side to disambiguate constructors or fields, and on the expression side for GADT to disambiguate escaping type variables.

Consider for instance,

module M = struct
  type _ t = X:  int t
end
let x = M.X
let X: _ M.t = x

without the annotation on the pattern side, we cannot disambiguate that X is M.X.

Similarly, for GADTs, in

let f: type a. a -> a M.t -> a = fun x X -> if Random.bool () then 0 else x

the annotation is needed to know that the return type is supposed to be a and not int.
Moreover, there is also the fact that polymorphic annotation like

let f: 'a . 'a -> 'a M.t -> 'a =
  fun (type a) x (X:a t) ->  if Random.bool () then 0 else x

rather belongs to the pattern side.

(And the previous behavior was also quite complex for binding level coercions).

4 Likes

Thanks for the thorough reply! I’ll definitely be keeping it as a reference. The refactoring you describe sounds immensely convenient to me, so I’ll look forward to updating my project to use OCaml 5.1. :slight_smile:

Dear octachron,

This example doesn’t type check, and I can’t figure out what you meant to write. Otherwise great answer!

Fixed to a valid if artificial example

let x = M.X
let X: _ M.t = x

Thanks!