Trying to understand list pattern matching and the `::` operator

Hi everyone, I’ve been working through the OCaml CS3110 book, and I’m struggling to wrap my head around list pattern matching.

In the examples used, it shows how you can use the cons operator (::) to prepend elements to a list.

E.g.

let l = []

1 :: l 

Which will output a list of ints [1].

It then goes on to talk about how you can get elements out of a list by using pattern matching. From my understanding this is taking an element/condition (?), and returning a value based on what it “matches”.

So, this makes sense to me:

let l = []

match l with
| [] -> "empty"

But then if this list has some elements, and say you want to just return the head of the list, you can do:

let l = [1; 2; 3]

match l with 
| [] -> empty (* won't match *)
| h :: t -> h

I also understand that these are pattern variables which can be mapped to the element being matched.

But what I don’t understand is that a few lines previously I was told this is the cons operator which prepends to lists. But in this instance within the match statement it behaves differently.

I think I can understand what it’s doing: it takes [1; 2; 3] and puts 1 into h, and put’s the tail of the list into t. Is it just using the same syntax as list prepending, but actually has a different meaning here?

Yes. Constructors (such as :: or any sum type constructors) can be used both in an expression (where they are used to construct values) or in a pattern (within a match or similar constructs) where they are used to “destruct” values, ie match over their constituents.

Cheers,
Nicolas

3 Likes

When you use it to construct a value, it will prepend a single value to a list of values:

# let lst = 99 :: [1; 2; 3];;
val lst : int list = [99; 1; 2; 3]

When you use it do deconstruct a value by pattern matching, it will separate the list into the first item and the rest:

# let a :: b = [1; 2; 3; 4];;
val a : int = 1
val b : int list = [2; 3; 4]

Note that the last statement is a partial match, because it fails in case the list is empty (it has no first element then), so we get a warning:

Warning 8 [partial-match]: this pattern-matching is not exhaustive.
  Here is an example of a case that is not matched: []

So when using it to deconstruct a value, you will usually use it in a match statement:

# match [1; 2; 3; 4] with
| [] -> "empty"
| a :: b -> string_of_int a;;
- : string = "1"

It’s “sort-of” the same, but reversed. You can

  • have it on the right-hand side of an “assignment” (edit: I think it’s not really called an assignment) (let lst = 99 :: [1; 2; 3]) or use it anywhere as an expression to add an element to the beginning of a list, or
  • you have it on the left-hand side of an assignment (as in let a :: b = [1; 2; 3; 4]) or use it in any other pattern matching position if you want to separate the first element from the rest of the list.

Disclaimer: I’m still (somewhat) a beginner too, and not too familiar with the terminology, so take my post with caution. I hope it does help understanding the operator better though.

1 Like

I think this is where I was getting mixed up, I didn’t realize that it can be used in both. The destructing aspect of it makes a lot of sense.

Appreciate the reply :slight_smile:

Ahh these examples with it being used on the left-hand side of a let expression are so helpful, thank you. Makes a lot more sense to me now. Thank you!

Well, if you’re still a beginner, I’m not sure what that makes me haha :sweat_smile:

The terminology issue here is that :: is not really an operator, it’s a constructor. An operator in OCaml, something like + or ^ or &&, is basically just a function with a symbolic name, it takes arguments and returns a value. But a constructor can be used both to construct values and to deconstruct them. I’m not sure if you’ve already learned about the option type but it helps to understand that :: is the same kind of thing as Some, except it’s symbolic and infix (btw I think :: is the only infix binary constructor). And in this respect it differs from other infix operator-like symbols which are really functions.

1 Like

I recently stumbled upon differences between constructors vs functions in a different context:

I tried something like:

List.map Some [1; 2; 3] (* Doesn't compile. *)

Which doesn’t work, because Some is a constructor and not a function. There is a function some, which does the job though:

List.map Option.some [1; 2; 3]

Accordingly, as you said:

# ( + ) 10 55;;
- : int = 65
# ( :: ) 1 [2; 3; 4];;
Error: Syntax error

This shows that :: doesn’t really behave like the other infix operators.

And you can reuse the :: constructor, e.g. for non-empty lists to avoid unnecessary handling of [](which is also a constructor).

This isn’t OCaml, but the Mercury programming language has these as functions, so a list literal can execute arbitrary code:

:- module sillylist.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module string.

:- func [] = string.
[] = "end\n".

:- func [T | string] = string.
[X | S] = string(X) ++ ", " ++ S.

main(!IO) :-
    io.write_string([1, 2, 3, 4], !IO),
    io.write_string(["hello", "world"], !IO).

output:

1, 2, 3, 4, end
"hello", "world", end

Curious OCaml has a list of alternating types, which can be done with a list literal:

module Alternating = struct
  type ('a, 'b) t =
    | ( :: ) : 'a * ('b, 'a) t -> ('a, 'b) t
    | []
end

let rec dump_alt alist =
  let module A = Alternating in
  match alist with
  | A.[] -> ()
  | A.(a :: b :: rest) ->
    Printf.printf "INT: %d\nSTR: %s\n" a b;
    dump_alt rest
  | A.[a] -> Printf.printf "INT: %d\n" a

let () = dump_alt Alternating.[1; "2"; 3; "4"; 5]

where dump_altonly works for (int, string) Alternating.t

and I’m sure this can be done much less awkwardly…

let rec dump_alts f g =
  let module A = Alternating in
  function
  | A.[] -> ()
  | A.(hd :: tl) ->
    f hd;
    aux g f tl
and aux f g =
  let module A = Alternating in
  function
  | A.[] -> ()
  | A.(hd :: tl) ->
    f hd;
    dump_alts g f tl

let () =
  let f x = Printf.printf "INT: %d\n" x in
  let g x = Printf.printf "STR: %s\n" x in
  let h `b = Printf.printf "???: `b\n" in
  dump_alts f g Alternating.[1; "2"; 3; "4"];
  dump_alts g h Alternating.["a"; `b; "c"]