Ergonomically adding type annotations to functions for debugging

Hello, folks!

Sometimes, when I’m writing a bigger function and I can’t figure out where my type errors are coming from, I like to add explicit type annotations to see where my code creates problems for the type checker as such:

let rec pow_mod (x : int) (y : int) (m : int): int =
  if y <= 0 then 1
  else if y mod 2 = 0 then pow_mod (x * x mod m) (y / 2) m
  else x * pow_mod x (y - 1) m mod m

Now this works like a charm but is quite cumbersome (adding lots of parentheses and jumping around a lot).

Alternatively, one could write it like this

let rec pow_mod : int -> int -> int -> int =
  fun x y m -> ...

which makes type annotations much nicer to add when needed and remove later: one needs to only insert code after the identifier name and can easily mark that whole section for deletion. Unfortunately this is quite ugly (in my opinion), so I would still need to rewrite that section to remove the fun x y m -> part and integrate the arguments into the let binding.

Ideally I would like to just add a typehint like this

val pow_mod : int -> int -> int -> int
let rec pow_mod = ...

which sadly is not possible to my knowledge (val is only available in .mli files)

So I have two questions:

  1. If you have any tips on how to achieve something similar more seamlessly that would be great!
  2. I find the idea of toplevel type annotations with val foo : bar -> baz in normal .ml files as an alternative to inline type annotations let foo (x : bar) : baz = ... very useful. Are there good reasons for why this is not possible? Could this perhaps be a new feature that might improve ergonomics?
1 Like

I’m not sure if this is a common pattern, but one option to do this “in-line” as in within the same file is to put your type signatures into a module type (signature) and apply it to a module that you include at the top level. Here’s an example:

module type S = sig
  val pow_mod : int -> int -> int -> int
end

module Pow_mod : S = struct
  let rec pow_mod x y m =
    if y <= 0 then 1
    else if y mod 2 = 0 then pow_mod (x * x mod m) (y / 2) m
    else x * pow_mod x (y - 1) m mod m
end

include Pow_mod
1 Like

As for why you can’t apply top-level signatures within the same file, I assume that this was an intentional design decision done to either aid separate compilation (and minimize reparses) or else to aid documentation. FWIW, I find it nice to have “header files” as a first-class language feature without the negative implications that come with that in C.

2 Likes

Fwiw, I’ve come to the opinion that it is actually quite beautiful, and I find that it has a nicer economy expression than the the redundant Haskell/Hope approach. I like to write this style like so

let rec pow_mod
  : int -> int -> int -> int  
  = fun x y m ->
    if y <= 0 then 1
    else if y mod 2 = 0 then pow_mod (x * x mod m) (y / 2) m
    else x * pow_mod x (y - 1) m mod m

and I read it along these lines:

Let pow_mod be a recursive function from int to int to int to int, defined as a function from x, y, and m to …

This only answers your opinion with my own, but my suggestion here is just to learn to appreciate the elegance of these expressions on their own terms :slight_smile:

IMO, the very good reason is that we don’t need it, since we can already capture everything the val based syntax would give via the inline method discussed here or the module signature method suggested by @bsidhom. Adding another way of writing the same thing wouldn’t add any expressiveness to the language, but would further complicate the syntax for reasons that are purely superficial.

In case you want the annotations to be separate from the implementation using the module signature approach, but you don’t want to actually seal the interface of the module (because sometimes that is inconvenient when you just want to assert the types of the values for validating your intentions), you can do something like

module Pow_mod = struct
  let rec pow_mod x y m =
    if y <= 0 then 1
    else if y mod 2 = 0 then pow_mod (x * x mod m) (y / 2) m
    else x * pow_mod x (y - 1) m mod m
end

module _ = (Pow_mod : sig
  val pow_mod : int -> int -> int -> int
end)

You can do something similar if you just want to do a quick annotation of a function for verifying your intentions but you don’t want to mess with the way you are actually declaring the function:

let rec pow_mod x y m =
  if y <= 0 then 1
  else if y mod 2 = 0 then pow_mod (x * x mod m) (y / 2) m
  else x * pow_mod x (y - 1) m mod m
let _ = (pow_mod : int -> int -> int -> int)

This also has the benefit of letting you repeat the function name, in case that sparks joy :smiley:

6 Likes

Actually, I hadn’t thought about it before, but if you’re trying to define type signatures in-line like this, you are very likely working in the context of a single executable file with no module dependencies. In other words, if you haven’t wired everything together yet, then you will get compiler warnings due to pow_mod not being used. This technique solves both the typing and warning at once. :slightly_smiling_face:

2 Likes

Nice idea, but unfortunately this significantly pessimizes error messages since it makes the type checker infer the type of the function and then unify it with the expected type, rather than directly checking the function against its expected type.

let f : int -> int =
    fun _ -> "a"
             ^^^
Error: This expression has type string but an expression was expected of type
         int

vs

module type S = sig
    val f : int -> int
end
module F : S = struct
    let f _ = "a"
end

Error: Signature mismatch:
       Modules do not match:
         sig val f : 'a -> string end
       is not included in
         S
       Values do not match:
         val f : 'a -> string
       is not included in
         val f : int -> int
       The type int -> string is not compatible with the type int -> int
       Type string is not compatible with type int

(There is (afaik) no technical reason why OCaml couldn’t use the module signature to check the function immediately against its expected type in the second case and produce a good error message, but that’s just how the compiler works)

IMO, the direct let f : int -> int = ... syntax is by far the best option since it produces much better error messages than all the other ones (including let f (x : int) : int = ...)

3 Likes

This summarizes quite well the different approaches I have experimented with after reading this so far, which all have their upsides but ultimately are not quite satisfactory to me.

The main reason I want to add type-annotations is for the typechecker to supply helpful error messages even if the function i am currently working on is not yet well-typed. I want the typechecker to know which type to expect and then report what it found instead. I’m stealing the example of @welltypedwitch here:

let f : 'a -> int =
  fun _ -> "a"
               ^^^
Error: This expression has type string but an expression was expected of type int

This gives a pleasant error message reporting the problem right away. The downside to this however is that the conversion between the upper form and the standard way of writing such function via

let f _ = "a"

requires some extra editing steps which I find a bit cumbersome and was hoping to avoid.

As pointed out by @welltypedwitch, the other approaches simply fail to provide similar quality error messages. Which is very disappointing since the

let f x = x + 1
let _ : int -> int = f

approach is very elegant and allows me to simply delete the typechecking line once everything is well-typed.

This, of course, is just a minor issue and it takes just a little more time to type things out to get it working like a charm. I do however think that this is in fact a quality of life improvement that might be worth looking into, assuming I’m not the only one adding type annotations for debugging purposes. Being able to quickly add and remove explicit type annotations to an expression might be something LSP could help with but optimally the language should have an easy construct for doing so. This is one of the few things I miss from Haskell.

So thank you all for your input regardless, it’s good to see there are many ways to achieve this!

1 Like

For the part about editing after removing a type annotation, note that if you have e.g.

let rec pow_mod : int -> int -> int -> int =
  fun x y m -> ...

and then delete : int -> int -> int -> int and run ocamlformat, it will give you e.g.

let rec pow_mod x y m =
  ...

This is not to say that some more integrated two-way editing couldn’t be done better by editor tooling.

1 Like

It does not with the latest version. That’s because the two syntax used to be parsed to the same node, but starting with OCaml 5.2 the difference is represented in the AST, so a ppx or internal optimization could do different things in each case, and so the semantics of the two syntaxes are in principle different, even if in almost every case different.

Objects type-checking errors are sometimes quite impenetrable and maybe a variation of this can trick can help.

class some_klass = object (self)
  initializer 
    let _ : int -> int = self#f in
    ()

   method f _ = "a"
end
1 Like