Help on the problem "Sorting a List of Lists According to Length of Sublists"

Hello everyone. I’m a beginner in Ocaml and recently I ran into a problem when I was doing the “Sorting a List of Lists According to Length of Sublists” exercise on the official website.
Below is my current code:

let frequency_sort (lst : 'a list list) : 'a list list =
  let aux (lst : 'a list list) : (int * 'a list) list =
    let sizes = List.map List.length lst in
    let frequency lst =
      List.map (fun x -> List.filter (fun y -> y = x) lst |> List.length) lst
    in
    List.combine (frequency sizes) lst
  in
  List.sort (fun a b -> Int.compare (fst a) (fst b)) (aux lst)
  |> List.map (fun lst -> snd lst)

The code runs as expected, But when I want to explicitly annotate the frequency as 'a list -> int list:

let frequency_sort (lst : 'a list list) : 'a list list =
  let aux (lst : 'a list list) : (int * 'a list) list =
    let sizes = List.map List.length lst in
    let frequency (lst: 'a list) : int list =
      List.map (fun x -> List.filter (fun y -> y = x) lst |> List.length) lst
    in
    List.combine (frequency sizes) lst
  in
  List.sort (fun a b -> Int.compare (fst a) (fst b)) (aux lst)
  |> List.map (fun lst -> snd lst)

The function frequency_sort is now inferred as int list list -> int list list:

Can anyone explain to me why this happens? And how to make frequency_sort polymorphic again when I explicitly annotated the frequency?

1 Like

First, beware that your sort function is accidentally quadratic.

Concerning your issue, what happens is that the (unification) type variable 'a is shared in the whole definition of the frequency_short function. Thus, in your line

let frequency (lst: 'a list) : int list =

the type variable 'a is the same that in the line above

let frequency_sort (lst : 'a list list) : 'a list list =

and when you apply frequency to the int list returned by sizes, the typechecker deduces that you meant 'a = int which implies that the type of frequency_sort is int list list -> int list list.

You can fix this by using a different type variable for the annotation on frequency.

let frequency_sort (lst : 'a list list) : 'a list list =
  let aux (lst : 'a list list) : (int * 'a list) list =
    let sizes = List.map List.length lst in
    let frequency (lst: 'b list) : int list =
      List.map (fun x -> List.filter (fun y -> y = x) lst |> List.length) lst
    in
    List.combine (frequency sizes) lst
  in
  List.sort (fun a b -> Int.compare (fst a) (fst b)) (aux lst)
  |> List.map (fun lst -> snd lst)

But it is generally more robust for debugging annotation to use locally abstract types rather than type variables:

let frequency_sort (type a) (lst : a list list) : a list list =
  let aux (lst : a list list) : (int * a list) list =
   ...
    let frequency (type b) (lst: b list) : int list =
 ...

Here the syntax frequency (type b) has the advantage of introducing a fresh abstract type b which is local to the scope of the frequency function and cannot be unified with any other types.

Often, this methods allows to catch mistakes far earlier. For instance, writing

let id (x:'a) = x + 1

makes the typechecker deduce that 'a = int whereas writing

let id (type a) (x:a) = x + 1

raises the error

Error: The value x has type a but an expression was expected of type int

4 Likes

Thank you very much for the detailed explanation :smiley: