When should I use single case constructors or type aliases?

Examples of “single case constructor”:

type id = Id of int

type person = Person of {
  name: string;
  age: int
}

Alternatively, I can use type aliases:

type id = int

type person = {
  name: string;
  age: int
}

When to use which?


In both examples, you are not using the id type after defining it. Did you mean to use it as a member of the record types?

Note that person is not typically considered a type alias, since it actually defines a new type (there is no way to use a record type without defining it first).

The first one, id, is indeed a type alias because id and int can be used interchangeably in any context, so your question makes sense for it. Using a one-constructor sum type instead helps you avoid accidentally confusing an id with an unrelated int in your program.
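As a minimal sketch of that last point (the lookup function and its output format are hypothetical, invented only for illustration), the wrapped version makes the compiler reject a bare int where an id is expected:

```ocaml
(* Hypothetical example: lookup and its output format are made up. *)
type id = Id of int

(* Only accepts a wrapped id, never a bare int. *)
let lookup (Id n) = Printf.sprintf "user-%d" n

let () = print_endline (lookup (Id 42))

(* The following line would be a type error, since 42 is an int and not an id:
   let _ = lookup 42 *)
```

With type id = int instead, lookup 42 would type-check silently, which is exactly the confusion the constructor guards against.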

Cheers,
Nicolas


It’s used in the hex library so that you can keep track of which string is the raw array of bytes and which is the hex-encoded one.

Sometimes you can use a private type alias (instead of a one-constructor sum) for a similar purpose.
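Here is a rough sketch of the private alias idea. The module name Hex_string and the function of_raw are assumptions made up for this example, not the API of the hex library:

```ocaml
(* Hypothetical module: values of type t can only be built through of_raw,
   but they can be read back as plain strings with a coercion. *)
module Hex_string : sig
  type t = private string
  val of_raw : string -> t
end = struct
  type t = string

  (* Encode each byte as two lowercase hexadecimal digits. *)
  let of_raw s =
    String.to_seq s
    |> Seq.map (fun c -> Printf.sprintf "%02x" (Char.code c))
    |> List.of_seq
    |> String.concat ""
end

let encoded = Hex_string.of_raw "ab"

(* Reading requires an explicit coercion back to string. *)
let length = String.length (encoded :> string)
```

Outside the module, a plain string cannot be passed off as a Hex_string.t, yet there is no constructor to pattern-match away and no allocation for a wrapper.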


The other use for it that I know is when you need to rely on physical equality to distinguish values.

let x = 5
let y = 5
let () = if x == y then print_endline "eq" else print_endline "neq"

type id = Id of int
let a = Id 5
let b = Id 5
let () = if a == b then print_endline "eq" else print_endline "neq"

This prints eq and then neq.

The reason is that Id 5 allocates a new block (it’s a constructor so it constructs!!) whereas 5 is just a literal for an immediate value.


I would be surprised if it does what you say outside the toplevel. Sharing immutable constants that are structurally equal is something that even the bytecode compiler does, so I’m expecting eq and eq except in bytecode mode with debugging on.

Indeed! I had tested it by calling ocaml with a file having the content above, but compiling and then running prints eq and eq.

You can wrap the construction inside some complicated enough code that the compiler won’t be able to share these constants.

let b = ref true
let f v = if !b then v else assert false
let x = f 5
let y = f 5
let () = if x == y then print_endline "eq" else print_endline "neq"

type id = Id of int
let g v = if !b then Id v else assert false
let a = g 5
let b = g 5
let () = if a == b then print_endline "eq" else print_endline "neq"

The compiler is very clever and can see through your tricks :)

$ ocamlopt -config-var flambda
true
$ ocamlopt -O3 test.ml -o test && ./test
eq
eq

What about making it blind with:

let g v = Sys.opaque_identity (Id v)

You would likely prefer Id (Sys.opaque_identity v).

Your suggestion would prevent the compiler from evaluating a == b to true at compile time, but it would still evaluate to true at runtime.
The reason is that Sys.opaque_identity acts as a barrier between its evaluated argument and the rest of the world. It doesn’t prevent any optimisation from occurring on its own argument. So after inlining you would get:

let a = let v = 5 in Sys.opaque_identity (Id v)
let b = let v = 5 in Sys.opaque_identity (Id v)

Which is equivalent to:

let v_a = 5
let arg_a = Id v_a
let a = Sys.opaque_identity arg_a
let v_b = 5
let arg_b = Id v_b
let b = Sys.opaque_identity arg_b

This gets optimised into:

let shared_const = Id 5
let a = Sys.opaque_identity shared_const
let b = Sys.opaque_identity shared_const

So I guess the advice is not “use constructors to make values physically different”, but rather “avoid constructors (or use [@@unboxed]) so that physical equality doesn’t depend on compiler optimisations”.
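A small sketch of the unboxed variant, assuming the attribute meant here is [@@unboxed] on the type declaration. Because the constructor then has no runtime representation of its own, Id 5 is just the immediate integer 5, and == compares the integers:

```ocaml
(* With the unboxed attribute, Id n is represented exactly like n,
   so no block is allocated for the constructor. *)
type id = Id of int [@@unboxed]

let a = Id 5
let b = Id 5

(* Physical equality on immediate integers compares their values. *)
let () = print_endline (if a == b then "eq" else "neq")
```

This should print eq whether or not the optimiser shares constants, since there is no block left to share.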

There are some situations where aliases are not sufficient, and so you must define a new type. The following is impossible (without the -rectypes option):

type 'a infinite_list = 'a * 'a infinite_list
(* Error: The type abbreviation infinite_list is cyclic *)

But this is accepted:

type 'a infinite_list = L of 'a * 'a infinite_list
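To show the accepted definition in use, here is a brief sketch (the names ones, head and tail are picked for this example) that builds a cyclic value and walks a few steps into it:

```ocaml
type 'a infinite_list = L of 'a * 'a infinite_list

(* A cyclic value: the tail of ones is ones itself. *)
let rec ones = L (1, ones)

let head (L (x, _)) = x
let tail (L (_, rest)) = rest

(* Walking into the list never reaches an end. *)
let third = head (tail (tail ones))
```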

id and person are two examples of “single case constructor” vs type alias.

The person case is not considered a type alias: records in OCaml are nominative, meaning that two record definitions with the same fields are not compatible:

type t1 = { x : int; y : int }
type t2 = { x : int; y : int }
let v1 : t1 = { x = 0; y = 1 }
let v2 : t2 = { x = 0; y = 1 }
let same = v1 = v2 (* <- type error: v2 has type t2, but t1 was expected *)

On the other hand, tuples are structural: int * int is the same type everywhere.

type u1 = int * int
type u2 = int * int
let v1 : u1 = (0, 1)
let v2 : u2 = (0, 1)
let same = v1 = v2 (* no error: u1 and u2 are both int * int *)

That’s why t1 and t2 (and both definitions of person in your original examples) are not considered aliases, while u1, u2 (and the second definition of id) are considered aliases.

So my answer to your original question is to use single case constructors when you want to create distinguished types, noting that records already perform this so adding an extra constructor only adds syntactic burden in that case.
In non-record cases the extra constructor can have a small runtime cost, which you can eliminate with the [@@unboxed] attribute if performance is a concern.


Oh I see. Right. So in OCaml ‘traditionally’ the approach has been to use modules. You can find a fuller explanation in the “Files, Modules, and Programs” chapter of Real World OCaml.

Code snippet from there:

module type ID = sig
  type t

  val of_string : string -> t
  val to_string : t -> string
  val ( = ) : t -> t -> bool
end

module String_id = struct
  type t = string

  let of_string x = x
  let to_string x = x
  let ( = ) = String.equal
end

module Username : ID = String_id
module Hostname : ID = String_id

type session_info = {
  user : Username.t;
  host : Hostname.t;
  when_started : Unix.tm;
}

This allows you to ‘spin up’ as many ‘string ID’ types as you need. You can also create an equivalent ‘int ID’ module:

module Int_id = struct
  type t = int

  let of_string = int_of_string
  let to_string = string_of_int
  let ( = ) = Int.equal
end

And of course any other type you want that can support these operations. The advantage is that the actual modules you create to represent the domain IDs (Hostname, Username) come with built-in support operations.
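As a sketch of what the sealing buys you (the values user and host are made up for the example), the two ID modules share an implementation but their types cannot be mixed:

```ocaml
module type ID = sig
  type t

  val of_string : string -> t
  val to_string : t -> string
  val ( = ) : t -> t -> bool
end

module String_id = struct
  type t = string

  let of_string x = x
  let to_string x = x
  let ( = ) = String.equal
end

(* Sealing with ID hides the fact that t is a string. *)
module Username : ID = String_id
module Hostname : ID = String_id

let user = Username.of_string "alice"
let host = Hostname.of_string "alice"

(* Comparing two usernames uses the equality that ships with the module. *)
let same_user = Username.(user = of_string "alice")

(* Mixing the two is rejected by the type checker:
   let oops = Username.(user = host) *)
```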


This is often known as The newtype pattern where you create a completely new nominal type which is just a wrapper around an existing type.

It’s an amazing FP pattern, but if used too liberally it can become an obstacle.

There’s an excellent blog post on the subject that gives you an idea of when to use and when not to use it (it’s in Haskell but the main sentiment applies to OCaml as well).


I would usually prefer an abstract type, but the singleton variants can be useful when pattern-matching GADTs. Consider:

type id = Id of int

type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : id -> id value

Here, an x : int value can be pattern-matched with the single case Int. If instead type id = int, the Obj case would also have to be matched and rejected; and if id were abstract, the same would be true when matching an x : bool value.
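A short sketch of such a single-case match (the function name int_of_value is made up for the example). Since id is a distinct datatype, the type checker can rule out Obj when the index is int:

```ocaml
type id = Id of int

type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : id -> id value

(* Only the Int case can have type int value, so one case is enough. *)
let int_of_value : int value -> int = function
  | Int n -> n

let three = int_of_value (Int 3)
```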


You can achieve the same thing without needing to wrap int in the variant. Since the types on either side of each -> don’t need to be the same, the following code works the same way:

type id = private Id
type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : int -> id value
    (* or an abstract type instead of int *) 

You could also use a polymorphic variant like Obj: int -> [`Id] value to avoid defining a new type altogether.
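A sketch of the polymorphic variant version, with hypothetical accessor names. The tag only serves as a type-level index here; the payload of Obj stays a plain int:

```ocaml
type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : int -> [ `Id ] value

(* Each accessor needs a single case, because the index selects it. *)
let int_of_value : int value -> int = function
  | Int n -> n

let id_of_value : [ `Id ] value -> int = function
  | Obj n -> n

let seven = id_of_value (Obj 7)
```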


Yes, that can be an option, though when the parameter is used to constrain a universal type variable which occurs elsewhere in a signature, one might want it to carry a value.

Good point with the private type in either case, I think that’s as close as we can get to an abstract type.


The newtype pattern has gone mainstream because you can even use it in Python now!

from typing import NewType

Id = NewType('Id', int)

def f(x: Id):
    ...

f(123) # Static type-checker error

Thank you for recommending the blog posts by Alexis King. “Names are not type safety” and “Parse, Don’t Validate” are so illuminating!
