Help me to understand functors

functor

#1

Hello,

I’m on the way of learning Ocaml, and one of the things I’m most interested about are Functors. I am reading the section on realWorldOcaml about functors, and the first part is quite clear to me. The problems start to appear when you try restrict bypassing the create function. It starts creating an internal endpoint type, which is just the same t type of the Endpoint type it accepts. How does this avoid the external code bypassing the create function ? On the declared interface the internal endpoint type is not used at any point, it is just part of the returned interface. All the internal functions uses the provided Endpoint.t type:

module Int_interval = Make_interval(Int);;
module Int_interval :
  sig
    type t = Make_interval(Base__Int).t
    type endpoint = Make_interval(Base__Int).endpoint
    val create : endpoint -> endpoint -> t
    val is_empty : t -> bool
    val contains : t -> endpoint -> bool
    val intersect : t -> t -> t
  end

See that ? type t uses the provided t type, the only function that uses the endpoint type is create, and that is just until you specify that endpoint is, in fact, the same Endpoint.t type. At the end you have 3 types (internal t, Endpoint.t and endpoint) which are the same, but this seems to magically avoid you to manually create an Interval type.

module type Int_interval_intf =
  Interval_intf with type endpoint := int;;
module type Int_interval_intf =
  sig
    type t
    val create : int -> int -> t
    val is_empty : t -> bool
    val contains : t -> int -> bool
    val intersect : t -> t -> t
  end

And, in fact the internal endpoint type is never used on the Functor declaration.
How does this work ?

Thanks


#2

It would be good for the discussion to show the actual functor Make_interval and not just the result from using it. An easier to understand example for a functor is Map.Make from the OCaml standard library.


#3

Hello @lindig, sorry about that.
I just supposed that, being RealWorld Ocaml the only book of reference and having it publicly available everybody would know about it’s examples. Now I see that was a stupid assumption.
Here is the interface they declare:

module type Interval_intf = sig
  type t
  type endpoint
  val create : endpoint -> endpoint -> t
  val is_empty : t -> bool
  val contains : t -> endpoint -> bool
  val intersect : t -> t -> t
end;;

Here is the Functor implementation

module Make_interval(Endpoint : Comparable)
  : Interval_intf with type endpoint := Endpoint.t =
struct

  type t = | Interval of Endpoint.t * Endpoint.t
           | Empty

  (** [create low high] creates a new interval from [low] to
      [high].  If [low > high], then the interval is empty *)
  let create low high =
    if Endpoint.compare low high > 0 then Empty
    else Interval (low,high)

  (** Returns true iff the interval is empty *)
  let is_empty = function
    | Empty -> true
    | Interval _ -> false

  (** [contains t x] returns true iff [x] is contained in the
      interval [t] *)
  let contains t x =
    match t with
    | Empty -> false
    | Interval (l,h) ->
      Endpoint.compare x l >= 0 && Endpoint.compare x h <= 0

  (** [intersect t1 t2] returns the intersection of the two input
      intervals *)
  let intersect t1 t2 =
    let min x y = if Endpoint.compare x y <= 0 then x else y in
    let max x y = if Endpoint.compare x y >= 0 then x else y in
    match t1,t2 with
    | Empty, _ | _, Empty -> Empty
    | Interval (l1,h1), Interval (l2,h2) ->
      create (max l1 l2) (min h1 h2)

end ;;

Regards


#4

There is a difference between M with type endpoint = .. and M with type endpoint := ... The latter basically eliminates type endpoint from the result:

I would suggest to start with a simpler example to explore this.

module type KEY = sig
  type t
  val compare : t -> t -> int
end

module type MAP = sig
  type key
  type t

  val empty: t
  val add: key -> int -> t -> t
  val lookup: key -> t -> int
end

module Make (Key: KEY ) : MAP with type key = Key.t = struct 
  type key = Key.t
  type t   = unit

  let empty = ()
  let add key x t = ()
  let lookup key t = 0
end


#5

Thanks @lindig, your example is simple but complete.
However, my big concern is about how declaring an internal type that is just an alias to the provided type allows the compiler restrict using the internal types and forces you go through the create function.
I understand how the := destructive substitution works, the ting is unclear is that (intentional) limitation.

Regards


#6

“Real World OCaml” is a great book, but is not the language manual (which exists and is quite readable in most places) and is not even the only book explaining the language.


#7

Is your problem the following? When implementing functor Make, you want to take advantage of knowing the type defined in the functor argument (Key.t). For example, if you know that Key.t is int, you would like to use that knowledge to create values directly.

If indeed this is your problem, there isn’t a solution, I’m afraid. In the general case you can create values of type Key.t only by the methods provided by Key. You can’t specialise Make for the case that Key.t is int. The code generated by the compiler relies on the fact that you can’t know this. The compiler generates code for Make only once and not for every instance where you supply a functor argument.


#8

Is the part that confuses you the abstract type? That’s a feature of the module system and it’s not limited to functors.

For example, you might have an id type that is useful for tracking the objects that your module deals with, but you don’t want users of your module to use it as a normal int by performing arithmetic operations on it, or using an invalid id (such as -1) with the functions provided by your module, so you can leave its implementation abstract in the module’s interface:

(* foo.ml *)
type id = int
let ids : int ref = ref 0
let create () =
  incr ids;
  !ids

(* foo.mli *)
type id
val create : unit -> id

This way, inside foo.ml the type id will just be an alias to int, but outside it will be an opaque object that can only be constructed and manipulated with the functions inside Foo.


#9

Yes exactly. It’s that. Thank you for your clarification. I’m still unsure how, without using the abstract type at all, you get the opaque benefit. On your example you are declared everything as just Int s.
I saw a related answer on another suggested post Curious about needs for private record fields
Seems they are also called private fields. How is this forum system so smart?


#10

The compiler just figures that on its own if you declare it properly in the module’s signature, as I did above. I’m not familiar with the theory behind it so I couldn’t tell you how exactly.

Private record fields do not exist in OCaml, the author of that thread was suggesting an extension to the language.


#11

Exactly, and we can say this is the only book of reference. :slight_smile: The answers for this thread are given here:

By the way, I’ll try to explain how modules and functors work with a simple example.

Modules.

Imagine that you want to define the notion of set of int with one value and two functions on it:

  • the empty set
  • is_element to test if an int belongs to a given set
  • add to add an int in a set (but only if it is not already present).

We can represent a set of int as a list of int, so we define this module:

module Int_Set = struct
  type elt = int (* an alias to `int` for the elements of the set *)
  type t = elt list (* an alias for the type of sets *)

  let (empty : t) = []

  let rec is_element i (set : t) =
    match set with
    | [] -> false
    | x :: xs -> (x = i) || is_element i xs

  let add i set = if is_element i set then set else i :: set
end

We can play around a bit with it:

let s = Int_Set.(add 1 (add 2 empty));;
val s : Int_Set.t = [1; 2]

Int_Set.is_element 2 s;;
- : bool = true

Int_Set.is_element 3 s;;
- : bool = false

We can also add elements and verify that it satisfies the invariant that an element already present isn’t added again:

Int_Set.add 3 s;;
- : Int_Set.t = [3; 1; 2]

Int_Set.add 2 s;; (* the invariant is satisfied *)
- : Int_Set.t = [1; 2]

But there is a problem: since the concrete representation for the type of sets is known (they are just lists), we can break the invariant.

Int_Set.add 3 (2 :: s);;
- : Int_Set.t = [3; 2; 1; 2]

The solution is to use what is called abstract type.

Signatures.

When you define a module, the compiler infer automatically its type or its signature. For the module defined above, the toplevel returns:

module Int_Set :
  sig
    type elt = int
    type t = elt list
    val empty : t
    val is_element : elt -> t -> bool
    val add : elt -> t -> t
  end

What we can do is to define a less precise signature to hide the fact that sets are implemented using lists:

module type S = sig
  type elt = int
  type t
  val empty : t
  val is_element : elt -> t -> bool
  val add : elt -> t -> t
end

Here, since we want to add ints in our sets, we don’t hide the fact that elements are int.

We can now restrict the interface of our module and the invariant can’t be broken anymore:

module Abstract_Int_Set = (Int_Set : S)

let s1 = Abstract_Int_Set.(add 1 (add 2 empty));;
val s1 : Abstract_Int_Set.t = <abstr>

1 :: s1;;
Error: This expression has type Abstract_Int_Set.t
       but an expression was expected of type int list

Ok, that’s fine: we have a simple notion of sets of int and we can preserve some invariant, but what to do if we want to generalize this notion to have sets over other types without repeat ourself? That’s where functors come into play.

Functors.

In the same way that functions are used to construct new values from given one, functors are used to construct modules from given one.

First, as in our int case, to define sets for a given type we need to know when two values are equal. So we define this signature:

module type EQ = sig
  type t
  val eq : t -> t -> bool
end

Then we define the general signature for a set module:

module type SET = sig
  type elt
  type t
  val empty : t
  val is_element : elt -> t -> bool
  val add : elt -> t -> t
end

Finally we write code to explain how to construct a module of set over a type when we have a function to test equality between its values:

module Make_Set (Elt : EQ) : SET with type elt = Elt.t = struct
  type elt = Elt.t
  type t = elt list

  let empty = []

  let rec is_element i set =
    match set with
    | [] -> false
    | x :: xs -> (Elt.eq x i) || is_element i xs

  let add i set = if is_element i set then set else i :: set
end

The code is mostly the same as before, except in the function is_element where to test for equality we use the function Elt.eq given by the parameter of the functor.

Now, with this generic way to construct module of sets, we can easily redifine our previous one:

module Abstract_Int_Set = Make_Set (struct type t = int let eq = (=) end)

let s2 = Abstract_Int_Set.(add 1 (add 2 empty));;
val s2 : Abstract_Int_Set.t = <abstr>

Abstract_Int_Set.is_element 2 s2;;
- : bool = true

Abstract_Int_Set.is_element 3 s2;;
- : bool = false

But we can also use it to have sets of string:

module String_Set = Make_Set (struct type t = string let eq = (=) end)

let s = String_Set.(add "hello" (add "world" empty));;
val s : String_Set.t = <abstr>

String_Set.is_element "hello" s;;
- : bool = true

String_Set.is_element "foo" s;;
- : bool = false

I hope that you can now see what are functors, how to define them and how to use them.


#12

Thank you very much @kantian !! Your explanation was awesome
It’s amazing how putting the same information on different words could make someone (in tis case me) understand something.

On the first part (before functors) it’s a bit awkward to me that you Just define the module, the signature and then you build the new abstract module by just saying that the first module has te type previously defined:

module type Int_Set = struct ...
module type S = sig ...
module Abstract_Int_Set = (Int_Set : S)

I find the functor way of doing it much more intuitive, because is on the definition of the functor where you couple it with the signature. But I think it is just a matter of familiarity.

Many many thanks


#13

Surely, that’s awkward, but in pratice you’ll not do this way: instead you’ll constraint the signature of your module at the definition site (or in a .mli file if you module is a compilation unit, as in @steinuil example).

In my comment this is just a consequence of choices I made for my presentation: I wanted to introduce the notion of module and signature successively to illustrate the problem of maintaining invariant and the utility of abstract type. But in practice, you’ll write something like that:

module Int_Set : sig
  type elt = int
  type t
  val empty : t
  val is_element : elt -> t -> bool
  val add : elt -> t -> t
end = struct
  (* your code *)
end

By the way, now that you understand these notions, we can look back to this example. From the simple specification of our problem:

an experimented OCaml programmer formalize these constraints with the general and abstract type signature for set:

module type SET = sig
  type elt
  type t
  val empty : t
  val is_element : elt -> t -> bool
  val add : elt -> t -> t
end

which is just a translation in OCaml of the problem. More precisely, since we want set of ints what we want is a module with this signature:

module type S = module type SET with type elt = int

and I proposed a simple implementation where the type t is equal to int list (this is not very efficient, binary search tree is a better solution). Then I generalized the procedure, using functors, to any type with an equality function (with binary search tree we will generalize to any type with a comparison function, as it’s done in the standard library).

We can now go to your original example (the one from Real World OCaml) with interval. The general signature of interval over endpoint type is:

module type INTERVAL = sig
  type endpoint (* the type for elements of the interval *)
  type t (* the type of the interval *)
  val create : endpoint -> endpoint -> t
  val is_empty : t -> bool
  val contains : t -> endpoint -> bool
  val intersect : t -> t -> t
end

and at the end, they define a functor to construct such a module from a type with a comparison function where the type t of interval is kept abstract for invariant preservation reason.

I guess you can now read this chapter again and understand what they’ve done.


#14

Again, @kantian, thank you very much. I cant’ ask for more, the concept is clear and your explanations are superb.
Now the only thing left is getting familiar with the concept and try to use it at some places.

Thank you very much.


#15

@kantian Would you be willing to have your explanation copied to ocamlverse? (I have to ask because ocamlverse is CC Zero licensed and you need to agree to that.)


#16

Sure, I agree. You can do what you want with my comments.


#17

@kantian: see: https://github.com/OCamlverse/ocamlverse.github.io/blob/master/content/functors.md

I lightly edited the prose to fix a few grammar problems and make a couple of things a bit more idiomatic.

If you want to add anything or edit it, let me know or submit a pull request!


#18

That’s fine, I just have few remarks:

  • maybe can we remove the dot at then end of the title of each subpart
  • there is a typo “Then then we define the general signature…”
  • there is no syntactic coloration for this particular signature (don’t know why)
  • you add an “a” in the phrase “*construct new values from a given one”, in this example there is only one parameter but functors may have multiple parameters (even if, indeed, strictly speaking a function has only one parameter)
  • another typo “we can easily redifine redefine our previous module”

#19

That’s fine, I just have few remarks:

  • maybe can we remove the dot at then end of the title of each
    subpart

Done.

  • there is a typo “Then then we define the general
    signature…”

Fixed.

  • there is no syntactic coloration for this particular signature
    (don’t know why)

Because I failed to put in the indicator that this was ocaml code in
the github flavored markdown. Fixed.

  • you add an “a” in the phrase “*construct new values from a
    given one”, in this example there is only one parameter but
    functors may have multiple parameters (even if, indeed, strictly
    speaking a function has only one parameter)

I’ve changed it a bit. Tell me what you think of the new one.

  • another typo “we can easily redifine redefine our previous
    module”

Fixed.

Let me know what you think of the fixes.


#20

In the set implementation, if x = i then true else is_element i xs is equivalent to the simpler x=i || is_element i xs.