I’m not sure whether this already exists somewhere, but I’ve never seen it, so here I go.
In the eternal quest of offsetting as many issues from runtime to compile time, I’ve always encoded the fact a list cannot be empty in its type. Historically as something akin to type 'a list_nonempty = 'a * 'a list
, which does the job decently: you get a compile time guarantee that the list has at least a head instead of asserting that List.hd
is not None
at runtime because you “just know it can’t be empty”. It also enforces clearly in the signature that an empty list cannot be passed. The drawback being that the type differs from a standard list, so client code must be adapted and becomes a bit more convoluted around the first element, but I had accepted that as the cost of such type safety.
More recently I toyed around the idea of using a GADT to enable using the actual list literal syntax and constructor for pattern matching:
module List_nonempty = struct
type ('a, 'empty) maybe =
| ( :: ) :
'a * ('a, [< `Nonempty | `Empty ]) maybe
-> ('a, [ `Nonempty ]) maybe
| [] : ('a, [ `Empty ]) maybe
let hd = function
| hd :: _ -> hd
let tl = function
| _ :: tl -> tl
let rec to_list = function
| [ last ] -> Caml.List.[ last ]
| hd :: next :: tl -> Caml.List.cons hd (to_list (next :: tl))
type 'a t = ('a, [ `Nonempty ]) maybe
end
This makes using literal syntax possible, List_nonempty.(hd [0; 1; 2])
has type int
as expected, List_nonempty.([ 0 ] |> tl |> hd)
is a type error as expected, etc. However the type differs entirely from List.t
, while with the previous solution at least the tail was a standard list. This made operating on the rest of the nonempty-list a breeze, and to_list
was very efficient (just prepend the head to the tail). Here because the ::
constructors differ, to_list
allocates an entirely new list, which may be unacceptable performance-wise. I had a mixed feeling about it.
Earlier today however, it struck me that since OCaml has become much better at resolving the correct constructs when it has a full type constraint on an expression, maybe I could get a :: b :: _
to resolve to two different ::
constructors in a single expression. Actually it should work out of the box … and lo and behold, this completely trivial and much simpler solution entirely does the trick:
module List_nonempty = struct
type 'a t = ( :: ) : 'a * 'a list -> 'a t
let map (hd :: tl) ~f = f hd :: List.map ~f tl
let hd = function
| hd :: _ -> hd
let tl = function
| _ :: tl -> tl
let to_list = function
| hd :: tl -> List.cons hd tl
end
This has all the static typing aforementioned properties, while being essentially transparent syntax-wise:
# let l = List_nonempty.[0; 1; 2];;
val l : int List_nonempty.t = List_nonempty.(::) (0, [1; 2])
# List_nonempty.hd l;;
- : int = 0
# List_nonempty.tl l;;
- : int list = [1; 2] (* the tail is a standard list *)
# List_nonempty.(hd (tl l));;
Error: This expression has type int list
but an expression was expected of type 'a List_nonempty.t
And of_list
is trivial and performant.
No more List.hd |> Option.value_exn
without significantly complexifying the codebase !