When should I use single case constructors or type aliases?

unfode · June 19, 2024, 3:23am

Examples of “single case constructor”:

type id = Id of int

type person = Person of {
  name: string;
  age: int
}

Alternatively, I can use type aliases:

type id = int

type person = {
  name: string;
  age: int
}

When to use which?

yawaramin · June 19, 2024, 6:04am

In both the examples, you are not using the id type after defining it. Did you mean to use it as a member of the record types?

nojb · June 19, 2024, 7:32am

Note that person is not typically considered a type alias, since it actually defines a new type (there is no way to use a record type without defining it first).

The first one, id, is indeed a type alias because id and int can be used interchangeably in any context. Thus, your question makes sense for it. Using a one-constructor sum type for it helps you avoid accidental confusion between id and an unrelated int in your program.

Cheers,
Nicolas
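
To illustrate the point about accidental confusion, here is a minimal sketch. The function name describe is hypothetical and only serves the example.

```ocaml
(* A one-constructor sum type: distinct from int as far as the type checker is concerned. *)
type id = Id of int

(* Hypothetical helper that only accepts a wrapped id. *)
let describe (Id n) = "user #" ^ string_of_int n

let () = print_endline (describe (Id 42))

(* describe 42 would be rejected at compile time, because 42 is an int, not an id.
   With the alias  type id = int  the same call would be accepted silently. *)
```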

raphael-proust · June 19, 2024, 7:45am

It’s used in the hex library so that you can keep track of which string is the raw array of bytes and which is the hex-encoded one:

Sometimes you can use a private type alias (instead of a one-constructor sum) for a similar purpose.
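
A minimal sketch of the private alias approach, assuming a hypothetical Id module (this is not the hex library's actual interface): outside the module, values can only be built through of_int, but they can still be read back as plain ints through a coercion.

```ocaml
module Id : sig
  type t = private int
  val of_int : int -> t (* the only way to build an Id.t from outside *)
end = struct
  type t = int
  let of_int n = n
end

let a = Id.of_int 5

(* Reading is allowed through an explicit coercion; no wrapper block is allocated. *)
let n = (a :> int) + 1

(* let b : Id.t = 5  is rejected: a bare int is not an Id.t. *)
```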

The other use for it that I know is when you need to rely on physical equality to distinguish values.

let x = 5
let y = 5
let () = if x == y then print_endline "eq" else print_endline "neq"

type id = Id of int
let a = Id 5
let b = Id 5
let () = if a == b then print_endline "eq" else print_endline "neq"

This prints eq and then neq.

The reason is that Id 5 allocates a new block (it’s a constructor so it constructs!!) whereas 5 is just a literal for an immediate value.

vlaviron · June 19, 2024, 7:54am

I would be surprised if it does what you say outside the toplevel. Sharing immutable constants that are structurally equal is something that even the bytecode compiler does, so I’m expecting eq and eq except in bytecode mode with debugging on.

raphael-proust · June 19, 2024, 8:19am

Indeed! I had tested it by calling ocaml with a file having the content above, but compiling and then running prints eq and eq.

You can wrap the construction inside some complicated enough code that the compiler won’t be able to share these constants.

let b = ref true
let f v = if !b then v else assert false
let x = f 5
let y = f 5
let () = if x == y then print_endline "eq" else print_endline "neq"

type id = Id of int
let g v = if !b then Id v else assert false
let a = g 5
let b = g 5
let () = if a == b then print_endline "eq" else print_endline "neq"

vlaviron · June 19, 2024, 8:37am

The compiler is very clever and can see through your tricks

$ ocamlopt -config-var flambda
true
$ ocamlopt -O3 test.ml -o test && ./test
eq
eq

dbuenzli · June 19, 2024, 8:46am

What about making it blind with:

let g v = Sys.opaque_identity (Id v)

vlaviron · June 19, 2024, 9:19am

You would likely prefer Id (Sys.opaque_identity v).

Your suggestion would prevent the compiler from evaluating a == b to true at compile time, but it would still evaluate to true at runtime.
The reason is that Sys.opaque_identity acts as a barrier between its evaluated argument and the rest of the world. It doesn’t prevent any optimisation from occurring on its own argument. So after inlining you would get:

let a = let v = 5 in Sys.opaque_identity (Id v)
let b = let v = 5 in Sys.opaque_identity (Id v)

Which is equivalent to:

let v_a = 5
let arg_a = Id v_a
let a = Sys.opaque_identity arg_a
let v_b = 5
let arg_b = Id v_b
let b = Sys.opaque_identity arg_b

This gets optimised into:

let shared_const = Id 5
let a = Sys.opaque_identity shared_const
let b = Sys.opaque_identity shared_const
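
For comparison, a sketch of the preferred form, with the barrier placed on the argument rather than on the result. Whether the last line prints eq or neq still depends on the compiler and its flags, so no output is claimed here; only the structural equality is guaranteed.

```ocaml
type id = Id of int

(* The argument is hidden from the optimiser, so the compiler cannot see
   that both calls build the same constant. *)
let g v = Id (Sys.opaque_identity v)

let a = g 5
let b = g 5

(* Structural equality holds whether or not the two values are shared. *)
let () = assert (a = b)

(* Physical equality is the part that depends on the compiler. *)
let () = print_endline (if a == b then "eq" else "neq")
```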

raphael-proust · June 19, 2024, 12:14pm

So I guess the advice is not
~~use constructors to make values physically different~~
but it should be

avoid constructors (or use [@@unboxed]) so the physical equality doesn’t depend on compiler optimisations

JohnJ · June 19, 2024, 1:16pm

There are some situations where aliases are not sufficient, and so you must define a new type. The following is impossible (without the -rectypes option):

type 'a infinite_list = 'a * 'a infinite_list
(* Error: The type abbreviation infinite_list is cyclic *)

But this is accepted:

type 'a infinite_list = L of 'a * 'a infinite_list
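
As a small sketch of how the accepted definition can be used (the names ones and take are made up for the example), a cyclic value can be built with let rec and consumed lazily:

```ocaml
type 'a infinite_list = L of 'a * 'a infinite_list

(* A cyclic value: an endless list of ones. *)
let rec ones = L (1, ones)

(* Take the first n elements as an ordinary list. *)
let rec take n (L (x, rest)) =
  if n <= 0 then [] else x :: take (n - 1) rest

let first_three = take 3 ones
```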

unfode · June 19, 2024, 1:24pm

id and person are two examples of “single case constructor” vs type alias.

vlaviron · June 19, 2024, 1:47pm

The person case is not considered a type alias: records in OCaml are nominative, meaning that two record definitions with the same fields are not compatible:

type t1 = { x : int; y : int }
type t2 = { x : int; y : int }
let v1 : t1 = { x = 0; y = 1 }
let v2 : t2 = { x = 0; y = 1 }
let _ = v1 = v2 (* <- type error: v1 has type t1 but v2 has type t2 *)

On the other hand, tuples are structural: int * int is the same type everywhere.

type u1 = int * int
type u2 = int * int
let v1 : u1 = (0, 1)
let v2 : u2 = (0, 1)
let _ = v1 = v2 (* no error: u1 and u2 are both int * int *)

That’s why t1 and t2 (and both definitions of person in your original examples) are not considered aliases, while u1, u2 (and the second definition of id) are considered aliases.

So my answer to your original question is to use single case constructors when you want to create distinguished types, noting that records already do this, so adding an extra constructor only adds syntactic burden in that case.
In non-record cases the extra constructor can have a small runtime cost, which you can get rid of using the [@@unboxed] attribute if performance is a concern.
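
A minimal sketch of the [@@unboxed] attribute mentioned above. The use of Obj here is only to peek at the runtime representation for the sake of the example; it is not something to do in real code.

```ocaml
(* Still a distinct type for the type checker, but represented as a bare int at runtime. *)
type id = Id of int [@@unboxed]

let a = Id 5

(* With the attribute, the value is an immediate integer rather than an allocated block. *)
let is_immediate = Obj.is_int (Obj.repr a)

(* Pattern matching works exactly as without the attribute. *)
let to_int (Id n) = n
```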

yawaramin · June 19, 2024, 2:45pm

Oh I see. Right. So in OCaml ‘traditionally’ the approach has been to use modules. You can see a bigger explanation here: Files, Modules, and Programs - Real World OCaml

Code snippet from there:

module type ID = sig
  type t

  val of_string : string -> t
  val to_string : t -> string
  val ( = ) : t -> t -> bool
end

module String_id = struct
  type t = string

  let of_string x = x
  let to_string x = x
  let ( = ) = String.equal
end

module Username : ID = String_id
module Hostname : ID = String_id

type session_info = {
  user : Username.t;
  host : Hostname.t;
  when_started : Unix.tm;
}

This allows you to ‘spin up’ as many ‘string ID’ types as you need. You can also create an equivalent ‘int ID’ module:

module Int_id = struct
  type t = int

  let of_string = int_of_string
  let to_string = string_of_int
  let ( = ) = Int.equal
end

And of course any other type you want that can support these operations. The advantage is that the actual modules you create to represent the domain IDs (Hostname, Username) come with built-in support operations.
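
To make this concrete, here is a sketch that seals Int_id with the ID signature twice. The module names User_id and Order_id are hypothetical; the point is that both are backed by int yet cannot be mixed up.

```ocaml
module type ID = sig
  type t

  val of_string : string -> t
  val to_string : t -> string
  val ( = ) : t -> t -> bool
end

module Int_id = struct
  type t = int

  let of_string = int_of_string
  let to_string = string_of_int
  let ( = ) = Int.equal
end

(* Two abstract ID types sharing one implementation. *)
module User_id : ID = Int_id
module Order_id : ID = Int_id

let u = User_id.of_string "42"
let o = Order_id.of_string "42"

(* Comparing two user ids is fine. *)
let same_user = User_id.(u = of_string "42")

(* User_id.(u = o) is rejected: o has type Order_id.t, not User_id.t. *)
```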

chshersh · June 20, 2024, 10:41am

This is often known as The newtype pattern where you create a completely new nominal type which is just a wrapper around an existing type.

It’s an amazing FP pattern! But if abused too frivolously it can become an obstacle.

There’s an excellent blog post on the subject that gives you an idea of when to use and when not to use it (it’s in Haskell but the main sentiment applies to OCaml as well).

Names are not type safety

paurkedal · June 20, 2024, 6:28pm

I would usually prefer an abstract type, but the singleton variants can be useful when pattern-matching GADTs. Consider:

type id = Id of int

type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : id -> id value

An x : int value can here be pattern-matched with a single case Int. If type id = int, then the Obj case needs to be matched and rejected, and if the type id was abstract, this would also be the case when matching an x : bool value.
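
A short sketch of what this looks like in practice (get_int and get_bool are names made up for the example): because id is a fresh variant type, the type checker knows it can never be equal to int or bool, so each function needs only one case.

```ocaml
type id = Id of int

type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : id -> id value

(* A single case is enough: Bool and Obj cannot produce an int value. *)
let get_int : int value -> int = function
  | Int n -> n

(* Likewise for bool value. *)
let get_bool : bool value -> bool = function
  | Bool b -> b
```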

JohnJ · June 20, 2024, 7:35pm

You can achieve the same thing without needing to wrap int in the variant. Since the types on either side of each -> don’t need to be the same, the following code works the same way:

type id = private Id
type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : int -> id value
    (* or an abstract type instead of int *)

You could also use a polyvar like Obj: int -> [`Id] value to avoid defining a new type altogether.
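
The polymorphic variant version could be sketched as follows (get_int and get_obj are hypothetical names). The index [ `Id ] is used purely as a type-level tag; the payload stays a plain int.

```ocaml
type _ value =
  | Bool : bool -> bool value
  | Int : int -> int value
  | Obj : int -> [ `Id ] value

(* Still a single case: the tag type is incompatible with int. *)
let get_int : int value -> int = function
  | Int n -> n

(* The payload of Obj is an unwrapped int. *)
let get_obj : [ `Id ] value -> int = function
  | Obj n -> n
```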

paurkedal · June 20, 2024, 9:14pm

Yes, that can be an option, though when the parameter is used to constrain a universal type variable which occurs elsewhere in a signature, one might want it to carry a value.

Good point with the private type in either case, I think that’s as close as we can get to an abstract type.

holmdunc · June 21, 2024, 6:06pm

The newtype pattern has gone mainstream because you can even use it in Python now!

from typing import NewType

Id = NewType('Id', int)

def f(x: Id):
    ...

f(123) # Static type-checker error

unfode · June 22, 2024, 3:45am

Thank you for recommending the blog post by Alexis King. Names are not type safety and Parse, Don’t Validate are so illuminating!
