The shape design problem

I’ll be glad if someone can explain me what is intelligible in my solution. To me, it seems the most natural and clearest way to solve his problem.

Can someone explain me how programmers used to define their concepts? To me, that’s how I proceed.

In our case we don’t have a birch, a lime and a oak, but a point and a rectangle and abstracting the way they differ we want to define the notion of a shape.

Using a sum type seems clearly strange and clumsy to me : do we really define a tree in enumerating all the different species : birch, lime, oak… And each time we find a new tree, we add it to this enumeration? No, we compare them and try to find what they have in common.

In our case, the most plausible situation, as describe in the original message, is that we first have three compilations unit Point, Rectangle and Circle. That’s how the message begins:

and what they have in common in their behavior:

The question in then as follow:

At this point, we don’t still have the notion of shape (it should be find), and the most natural way to write this in OCaml is doing so:

(* I use toplevel module, but in practice in will be compilation unit *)
module Point : sig
  type t
  val make : int -> int -> t
  val area : t -> float
  val draw : t -> unit
end = struct
  type t = {x : int; y : int}

  let make x y = {x; y}
  let area {x; y} = 0.0
  let draw {x; y} = ()
end

module Rectangle : sig
  type t
  val make : Point.t -> Point.t -> t
  val area : t -> float
  val draw : t -> unit
end = struct
  type t = {bottom_left : Point.t; top_right : Point.t}

  let make p1 p2 = {bottom_left = p1; top_right = p2}
  let area _ = 3.0
  let draw _ = ()
end

module Circle : sig
  type t
  val make : Point.t -> float -> t
  val area : t -> float
  val draw : t -> unit
end = struct
  type t = { center : Point.t; radius : float}

  let make center radius = {center; radius}
  let area {center; radius} = Float.pi *. 2.0 *. radius
  let draw _ = ()
end

So, here is our starting point. And now, I ask my self what is the common part between these three modules: they all satisfy these following interface

module type SHAPE = struct
  type t
  val area : t -> float
  val draw : t -> unit
end

That’s why, at this stage, I use first-class module to define a shape as something that satisfy this interface, in the same way that Jane Street does in their libraries.

But then, there is this constraint:

So I need to define a type that contains all the possible shapes, abstracting away the underlying type of this particular one. That’s exactly what existential types are aimed for.

When I write:

type shape = Shape : ('a * (module Shape with type t = 'a) )-> shape

that really means a value is a shape if it can be drawn and has an area. In other words, if there exists functions area and draw on it.

When you have two incompatible types, say int and float, the way to put them in a unique one is to use a sum type:

type int_or_float = Int of int | Float of float

and you use the Int and Float constructor to construct such a value.

But when you have an infinity of possible incompatible types, in OCaml the most natural way to define this (infinite) sum is to use the existential GADT I defined. And then you will use this constructor to construct a shape value.

val to_shape : (module Shape with type t = 'a) -> 'a -> shape

What is really not intelligible in this code?

2 Likes

For my own part (I cannot speak for others), if you reread my post I think you will see that my posting to which you refer was not commenting on the way you propose to set out the overall structure of your code in order to provide a solution to the OP’s posting.

I was commenting on whether, given that structure (involving the combining of data with a first class module containing the functions to operate on the data), this is more intelligibly done using an existentially quantified type via a GADT in the style you have shown (and which was new to me) on the one hand, or using a further wrapper module in the style shown in Real World Ocaml on the other hand, given that the effect of either mechanism is the same. I expect they have roughly equivalent efficiency also.

On the wider structural question you raise, with the simple case originally put by the OP, I would probably do neither, and instead just use variants holding records as a payload. If an explicit supertype/subtype relationship was wanted with respect to the shapes (and it may not be) I would use polymorphic variants with the records.

1 Like

I forgot a negation in my message, I wanted to say not intelligible about my solution. But if I understand correctly what you say here:

you don’t find my solution not intelligible. I believed that you find the version in Real World OCaml (without existential) more intelligible than mine, and so that mine was not intelligible.

Just a problem of misunderstanding from me, sorry.

I think the issue is that, although I understand the principles of GADT’s (attaching a secondary type to a variant case which can be used for type deduction purposes) and I can use them successfully, I find the inner working of the GADT’s type system quite mysterious, particularly in the behavior of existentially quantified type variables provided in the constructor of a GADT. What the existentially quantified type is doing in your code is informing the compiler that the type of t in the first class module concerned is the same as the data type (which is the existential type). Fair enough, but there are other ways to do it.

This reminds me of template metaprogramming in C++ which is OK in moderation but used to excess can make code unintelligible. So I do find the approach in Real World Ocaml easier to grasp.

Ok, I see. For me existential are not difficult to grasp, and I find it’s just a trivial use case of GADT (the simplest one).

But, in the OP case, if you just use variants holdings records, how do you add a new shape? All the solution proposed to solve this question, except mine, are absolutely anti-modular by design: the complexity of the shape components is dispatched on all the compilation units. With mine you can remove the shape.ml compilation unit, that won’t change anything for the Point, Rectangle and Circle ones. You just have to add this file to the existing code base (the one with the simplest version for these modules) and there is a new shape component in the system. Among all the proposed solutions, I don’t see any other one with this property.

For programming in small, I will use plain old algebraic data types. For public libraries and large programs designed for change, I will use the dependency inversion principle and make sure that my high-level policy code (e.g., drawing facilities) doesn’t depend on the low-level implementation details.

Large vs. Small

First of all, let’s define what is programming in large and what is in small. The truth is that there is no definite answer, you have to develop some taste to understand where you should just stick with plain ADT (PADT) or where to unpack the heavy artillery of GADT and final tagless styles. There is, however, a rule of thumb that I have developed and which might be useful to you as well.

ADT is the detail of implementation

ADT shall never go into the public interface. Both PADT and GADT are details of implementation and should be hidden inside the module and never reach the mli file, at least the mli of a publically available code (i.e., a library that you plan to distribute and for which you install cmi/mli files). Exposing your data definitions is much like showing your public member values in C++ or Java.

The less the extent of an ADT in your codebase the easier it will be to maintain it. Indeed, notice even how hard and ugly it is to work in OCaml with ADT defined in other modules. So even the language resists this. And if the language resists then don’t force it.

With that said, 90% of your code should be small code with data types defined for the extent of a compilation unit, which, in general, should be less than 300-500 lines of code (ideally start with 100) and have a couple of internal modules. Basically, a compilation unit should have a size that easily fits into short-term memory.

Since programming in small is a more or less clear concept, let’s move forward.

Programming In Large

So you’re designing a large project, possibly with a public API, that will help lots of developers with different backgrounds and skills. And as always it should be delivered next week, well at least the minimal viable product. The upper management is stressing you but you still don’t want to mess things up and be cursed by the generations of software developers that will use your product.

One of the killer features of OCaml, and the main reason why I have switched to it a long time ago and use it since then, is that it ideally fits the above-described use case. It fully supports the programming in large style (like Java and Ada) but enables at the same easy prototyping and delivering MVPs the next day your manager asks (like Python). And if you have young students in your team, their enthusiasm will not leave the boundaries of the module abstraction. And if you are the manager you have excellent language (the language of mli files) that you can use to convey your design ideas to other team members and even to non-technical personnel.

So let’s do some prototyping and design, using the language of signatures. First of all, we shall decide which type should be designed for change and which should be regular data types. E.g., we probably don’t want to have a string data type with pluggable implementation. But we definitely know that our figures will change and more will be added, possibly by 3rd-party developers.

We finally decide that Point and Canvas will be the regular types (for Canvas we might regret our design decision). For prototyping, we will write the module type of our project. At the same time, we should also write documentation for each module and its items (functions, types, classes, etc). Writing documentation is an important design procedure. If you can’t describe in plain English what a module is doing then probably it is a wrong abstraction. But now, for the sake of brevity and lack of time we will skip this important step and unroll the whole design right away.

module type Project = sig

  module Point : sig
    type t
    val create : int -> int -> t
    val x : t -> int
    val y : t -> int
    val show : t -> string
  end

  module Canvas : sig
    type t
    val empty : t
    val text : t -> Point.t -> string -> unit
  end

  module Widget : sig
    type t
    val create : ('a -> Canvas.t -> unit) -> 'a -> t
    val draw : t -> Canvas.t -> unit
  end

  module type Figure = sig
    type t
    val widget : t -> Widget.t
  end

  module Rectangle : sig
    include Figure
    val create : Point.t -> Point.t -> t
  end

  module Circle : sig
    include Figure
    val create : Point.t -> int -> t
  end
end

Let’s discuss it a bit. The Canvas and Point types are pretty obvious, in fact this design just assumes that they are already provided by the third-party libraries, so that we can focus on our figures.

Now the Widget types. Following the dependency inversion principle, we decided to make our rendering layer independent of the particular implementation of the things that will populate it. Therefore we created an abstraction of a drawable entity with a rather weak definition of the abstraction, i.e., a drawable is anything that implements ('a -> Canvas.t -> unit) method. We will later muse on how we will extend this abstraction without breaking good relationships with colleagues.

Another point of view on Widget is that it defines a Drawable type class and that Widget.create is defining an instance of that class, so we could even choose a different naming for that, e.g.,

module Draw : sig 
  type t
  val instance : ('a -> Canvas.t -> unit) -> 'a -> t
  val render : t -> Canvas.t -> unit
end

The particular choice depends on the mindset of your team, but I bet that the Widget abstraction would fit better more people.

But let’s go back to our figures. So far we only decided that a figure is any type t that defines val widget : t -> Widget.t. We can view this from the type classes standpoint as that the figure class is an instance of the widget class. Or we can invoke Curry-Howard isomorphism and notice that val widget : t -> Widget.t is a theorem that states that every figure is a widget.

The rest should be trivial with these signatures, so let’s move forward and develop MVP,

module Prototype : Project = struct

  module Point = struct
    type t = {x : int; y : int}
    let create x y = {x; y}
    let x {x} = x
    let y {y} = y
    let show {x; y} = "(" ^ string_of_int x ^ ", " ^ string_of_int y ^ ")"
  end

  module Canvas = struct
    type t = unit
    let empty = ()
    let text _canvas _position = print_endline
  end

  module Widget = struct
    type t = Widget : {
        draw : 'obj -> Canvas.t -> unit;
        self : 'obj;
      } -> t

    let create draw self = Widget {self; draw}
    let draw (Widget {draw; self}) canvas = draw self canvas
  end

  module type Figure = sig
    type t
    val widget : t -> Widget.t
  end

  module Rectangle = struct
    type t = {ll : Point.t; ur : Point.t}
    let create ll ur = {ll; ur}
    let widget = Widget.create @@ fun {ll} canvas ->
      Canvas.text canvas ll "rectangle"
  end

  module Circle = struct
    type t = {p : Point.t; r : int}
    let create p r = {p; r}
    let widget = Widget.create @@ fun {p} canvas ->
        Canvas.text canvas p "circle"
  end
end

Et voila, we can show it to our boss and have some coffee. But let’s look into the implementation details to learn some new tricks. We decided to encode widget as an existential data type, no surprises here as abstract types have existential type and our widget is an abstract type. What is existential you might ask (even after reading the paper), well in OCaml it is a GADT that captures one or more type variables, e.g., here we have 'obj type variable that is not bound (quantified) on the type level, but is left hidden inside the type. You can think of existential as closures on the type level.

This approach enables us to have widgets of any types and, moreover, develop widgets totally independently and even load them as plugins without having to recompile our main project.

Of course, using GADT as encoding for abstract type is not the only choice. It even has its drawbacks, like we can’t serialize/deserialize them directly (though probably we shouldn’t) it has some small overhead.

There are other options that are feasible. For example, for a widget type, it is quite logical to stick with the featherweight design pattern and represent it as an integer, and store the table of methods in the external hash table.

We might also need more than one method to implement the widget class, which we can pack as modules and store in the existential or in an external hash table. Using module types enables us gradual upgrade of our interfaces and build hierarchies of widgets if we need.

When we will develop more abstract types (type classes), like the Geometry class that will calculate area and bounding rectangles for our figures, we might notice some commonalities in the implementation. We might even choose to implement dynamic typing so that we can have the common representation for all abstract types and even type casting operators, e.g., this is where we might end up several years later.

module Widget : sig
  type 'cls t

  module type S = sig
    type t
    ...
  end

  type 'a widget = (module S with type t = 'a)

  val create : 'a widget -> 'a -> 'a cls t

  val forget : 'a t -> unit t
  val refine : 'a Class.t -> unit t -> 'a cls t option
  val figure : 'a t -> 'a
end

We’re now using a module type to define the abstraction of widget, we probably even have a full hierarchy of module types to give the widget implementors more freedom and to preserve backward compatibility and good relationships. We also keep the original type in the widget so that we can recover it back using the figure function. yes, we resisted this design decision, because it is in fact downcasting, but our clients insisted on it. And yes, we implemented dynamic types, so that we can upcast all widgets to the base class unit Widget.t using forget, but we can still recover the original type (downcast) with refine, which is, obviously, a non-total function.

In BAP, we ended up having all these features as we represent complex data types (machine instructions and expressions). We represent instructions as lightweight integers with all related information stored in the knowledge base. We use dynamic typing together with final tagless style to build our programs and ensure their well-formedness, and we use our typeclass approach a lot, to enable serialization, inspection, and ordering (we use domains for all our data type). You can read more about our knowledge base and even peek into the implementation of it. And we have a large library of signatures that define our abstract types, such as semantics, values, and programs. In the end, our design allows us to extend our abstractions without breaking backward compatibility and to add new operations or new representations without even having to rebuild the library or the main executable. But this is a completely different story that doesn’t really fit into this posting.

Conclusions

We can easily see that our design makes it easy to add new behaviors and even extend the existing one. It also provisions for DRY as we can write generic algorithms for widgets that are totally independent of the underlying implementation. We have a place to grow and an option to completely overhaul our inner representation without breaking any existing code. For example, we can switch from a fat GADT representation of a widget to a featherweight pattern and nobody will notice anything, except (hopefully) improved performance. With that said, I have to conclude as it already took too much time. I am ready for the questions if you have any.

14 Likes

No question, just want to thank you for this fully detailed explanation that seems clear to me, especially the type directed design and its evolution over time.

That’s exactly what I had in mind when I designed the to_shape constructor family:

val to_shape : (module Shape with type t = 'a) -> 'a -> shape

It’s a theorem that states that any member of a Shape type classes is a shape (that’s indeed the mere definition of a shape).

The only thing I wanted to add is that you’re starting to play with fire when you’re ended there :slight_smile:

1 Like

Well, three years later, I can only be happy that we made this decision (though I was accused for turning OCaml into Python). But the truth is that you have a rather unpleasant choice, either keep your AST untyped and always fear that is won’t be well-formed or add weak typing to it. It has to be weak because you will still need to unparse it from the untyped representation so you had to reify the typing into the language. However, after we recovered the type we might find ourselves in need to forget it. Yes, it is nice to have i32 bitv value and i8 bitv value so that the typechecker will prevent us from adding them together without proper casting (well, and the casting itself, requires casting on the host language), but we would also like to store all these values in a mapping or a set. Or, again, to serialize them. It is also important to notice, that our weak types are witnessed by concrete representations of sorts so you can’t cast an 8-bit value into 32-bit value.

1 Like

Just by curiosity, how do you implement dynamic typing: with type trickery like this?

type _ dyn_typ = ..

type _ dyn_typ +=
  | Int : int dyn_typ
  | Float : float dyn_typ
  | List : 'a dyn_typ -> 'a list dyn_typ

(* dynamically typed languages have only one type, namely top *)
type top = Any : 'a dyn_typ * 'a -> top

let typeof (Any (typ, _)) =
  let rec aux : type a. a dyn_typ -> string = fun typ ->
    match typ with
    | Int -> "int"
    | Float -> "float"
    | List t -> Printf.sprintf "list of %s" (aux t)
    | _ -> assert false
  in aux typ

let any typ v = Any (typ, v)

typeof (any Int 3);;
- : string = "int"

typeof (any (List Int) [1; 3]);;
- : string = "list of int"

Not like this, this won’t work as we need to serialize the typing information and restore it back. And we need to do this across different compilations (compiled with different compiler versions on different architectures). So GADT is a no-no for us.

Before going further, we separate classes and sorts. Where class is a collection of sorts. For example, we have classes such as values, effects, and programs, and each class is further subdivided into a potentially infinite quotient set of sorts, e.g., we have i bitvec values, where i is the type index for the size of the bitvector, which is countably infinite. Correspondingly we have memories that are indexed with two types, e.g., (p bitvec, v bitvec) mem values which is a storage for v bitvec objects indexed with p bitvec addresses. And so on, the sorts could be quite complex, as for example a sort for IEEE754 floating-point values, which has to track the representation, such as the number of bits in the number’s constituents, whether it is binary or decimal and so on. The same is with effects, we distinguish between control effects and jump effects.

The classes are typed statically, and it is very important because the class defines what slots a value can have. Contrary, the sorts are weakly indexed with phantom types. That means that those types exist only in the signatures, and in the implementation, we’re free to mess with them. We represent sorts of values algebraically, as we need an infinite set of sorts and need them to keep the typing information. Therefore here is our representation of a sort,

    type t =
      | Sym of name
      | Int of int
      | App of {args : t list; name : name option}
    [@@deriving bin_io, compare, sexp]
  end

  type +'a exp = Exp.t
  type +'a t = 'a exp
  type top = unit t

And sort equality is just derived from the equality of the sort constructors,

  let refine witness t = match t with
    | App {name=Some name}
    | Sym name when [%compare.equal: name] name witness -> Some t
    | _ -> None

  let forget : 'a t -> unit t = ident

And the names, which are witnesses are private to each module that makes the module that defines the sort to be the only point of failure, which is fully responsibly for the type rules of that sort, as this module and only this can provide the type casting operator, e.g.,

module Bitv : sig
  type 'a t
  val define : int -> 'a t sort
  val refine : unit sort -> 'a t sort option
  val size : 'a t sort -> int
end
1 Like

Yes, using variants to implement the polymorphism to which you refer makes it harder to add new shapes. You have to revise all existing functions, including those in other compilation units, which operate on the set of shapes in question in order to accommodate a new addition to the set. However, variants makes it easy to add new behaviors for any given set of shapes.

For completeness, here’s a simple object implementation:

module Surface = struct
  type t
end

class point x y = 
  object (self)
    val x : int = x
    val y : int = y
    method area : int = 1
    method draw (surface : Surface.t) : unit = ()
  end

class rectangle = object (self)
  val bottom_left : point = new point 1 2
  val top_right : point = new point 3 4
  method area : int = 1
  method draw (surface : Surface.t) : unit = ()
end

let l = [new point 1 2; new rectangle]
let areas = List.map (fun s -> s#area) l

They can be wrapped in modules to hide the details of the type:

module Surface = struct
  type t
end

module type SHAPE = sig
  type c = < area : int; draw : Surface.t -> unit >
  val make : unit -> c
end

module Point : SHAPE = struct
  class c x y = object (self)
    val x : int = x
    val y : int = y
    method area : int = 1
    method draw (surface : Surface.t) : unit = ()
  end
  let make () = new c 1 2
end

module Rectangle : SHAPE = struct
  class c = object (self)
    val bottom_left : Point.c = Point.make ()
    val top_right : Point.c = Point.make ()
    method area : int = 1
    method draw (surface : Surface.t) : unit = ()
  end
  let make () = new c
end

let l = [Point.make (); Rectangle.make ()]
let areas = List.map (fun s -> s#area) l

I’m wondering if you really need classes for the object implementation – you’re not actually inheriting from them? When reading @kantian’s first class module version above I imagined the corresponding object oriented solution would define a class shape with virtual methods area and draw, inherited by point or rectangle methods.
You show that an object type can also be used for specifying the interface of point and rectangle.

How about this variation of your object implementation (Disclaimer: I have not actually used OCaml objects in real life, so this may contain obvious mistakes!)

module Surface = struct type t end
type shape = < area : int; draw : Surface.t -> unit >

module Point = struct
    let make x y : shape = object
        val x = x
        val y = y
        method area = ignore (x,y); 0
        method draw surface = ()
    end
end
module Rectangle = struct
    let make lx ly ux uy : shape = object
        val lx = lx val ly = ly val ux = ux val uy = uy
        method area = (uy - ly) * (ux - lx)
        method draw surface = ()
    end
end 
let areas = List.map (fun s -> s#area) [Point.make 1 2; Rectangle.make 1 2 3 4]

Maybe not, no. Force of habit. :slight_smile:

OCaml uses row polymorphism so you don’t need to use inheritance to implement an interface. If an object has the corresponding type it has the interface. Inheritance in OCaml is used only to inherit implementation and to enable open-recursion.

With that said, OCaml objects actually fit into ADT and are as well a detail of implementation that should ideally be hidden. My implementation above, could be using objects as existential, and this is what we were using before GADT (and even first class modules) came to OCaml. Today, GADT offer a much more efficient and syntactically convenient encoding.

Speaking of OCaml classes and class types, they also leak too much implementation and it is hard to use them in large programs. The only and very rare use case, when you might find classes useful is when you need an open recursion, e.g., when you would like to inherit implementation and override some of the base methods so that the other base methods would call to them. But this feature is rarely needed and is most confusing to support as it creates the tightest possible coupling between the components, which is hard to understand and debug.

1 Like

To be honest, I must confess that I never read this book, I just skim over it. So, I decided to have a look at the chapter about first-class modules, and the solution I proposed is exactly what they did in this chapter : they just use the module type langage instead of the core type language to defined their notions.

So I can answer the previous @Chimrod question about documentation resources: chapter 10 of RWO about first-class modules.

RWO compare to my solution.

These two solutions are absolutely identical, all the types we use are isomorphic, we just not use the same language features to define them.

So let see on the above shape example. We start with this Shape algebra (a module type):

module type Shape = sig
  type t
  val area : t -> float
  val draw : t -> unit
end

After that we use first-class modules to define the type class:

type 'a shape_methods = (module Shape with type t = 'a)

After that, I stick to the core language to define the rest of the types I need

type 'a shape_object = {meth : 'a shape_methods ; this : 'a}
type shape = Shape : 'a shape_object -> shape

What they did in RWO is that they stick to the module type language to define exactly the same notions:

module type Shape_object = sig
  module Meth : Shape
  val this : Meth.t
end

(* this way it's as if they have done as follow *)
type 'a shape_object_rwo = (module Shape_object with type Meth.t = 'a)
type shape_rwo = (module Shape_object)

And the build_instance function they define is exactly the family of type constructors that I named to_shape. That’s it : we’re doing exactly the same thing.

As @ivg said above : abstract types are existential types, so the type shape_rwo as I defined above is an existential type isomorphic to mine. The only difference is that for ergonomic reasons I prefer to stick to the core language as I’m not scared by existential GADT.

1 Like

I have to admit I don’t really understand your widget implementation. What’s the benefit of the GADT here, compared to not using it? Especially in the context of polymorphism (List.map (fun s -> s.area) shapes).

GADT here work the same as non-PODT data structures in C++ or as classes in Java. They encapsulate a state and behavior in one object. E.g.,

type Widget = {
   self : 'a;
   draw : 'a -> Canvas.t -> unit;
}

It is a simple example that has only one method just as a showcase. Here are two more implementations of the Widget interface that are not using GADT

Using objects:

  module Widget = struct
    type t = < draw : Canvas.t -> unit >
  
    let create draw self = object
      method draw canvas = draw self canvas
    end
  
    let draw self canvas = self#draw canvas
  end

as you can see, we eschew the underlying implementation type and capture the self value in the closure.

But nothing stops us from using plain records,

  module Widget = struct
    type t = {draw : Canvas.t -> unit}
    let create draw self = {draw=draw self}
    let draw {draw} canvas = draw canvas
  end

well, in fact, since we have only function, why not just represent a widget as Canvas.t -> unit function,

  module Widget = struct
    type t = Canvas.t -> unit
    let create draw self canvas = draw self canvas
    let draw draw canvas = draw canvas
  end

So now you might have a question why pick GADT even we have much simpler solutions? A good question, as indeed in our example all our methods were covariant, i.e., the were consuming widgets but not producing. But in general, we would like to have methods that are contravariant, i.e., that create widgets, or, invariant that consume and produce widgets. For example, suppose that we added a quite natural method called move (and also decided to encode our interface as a module type),

So here is our new Widget module type,

  module Widget : sig
    type t

    module type S = sig
      type t
      val draw : t -> Canvas.t -> unit
      val move : t -> Point.t -> t
    end

    val create : (module S with type t = 'a) -> 'a -> t
    val draw : t -> Canvas.t -> unit
  end

Notice that move is an invariant method it takes a widget and returns a widget. Now let’s solve this challenge using GADT,

module Widget = struct

    module type S = sig
      type t
      val draw : t -> Canvas.t -> unit
      val move : t -> Point.t -> t
    end


    type t = Widget : {
        methods : (module S with type t = 'obj);
        self : 'obj;
      } -> t

    let create methods self = Widget {self; methods}
    let draw (Widget {methods=(module W); self}) canvas =
      W.draw self canvas

    let move (Widget {methods=(module W); self}) p = Widget {
        self = W.move self p;
        methods=(module W);
      }
  end

So far so good. But now let’s try to tackle it with objects, we define our abstract type as,

    type t = <
      draw : Canvas.t -> unit;
      move : Point.t -> t;
    >

and we fail even on the create function,

    let rec create (module W : S) self = object
      method draw canvas = W.draw self canvas
      method move point = 
        create (module W) (W.move self point)
    end

as it yields,

This expression has type 'a but an expression was expected of type W.t
The type constructor W.t would escape its scope

We can cope with this by using locally abstract data types and a recursive function,

    let rec create : type o. (module S with type t = o) -> o -> t =
      fun (type obj) (module W : S with type t = obj)
        (self : obj) ->
        object
          method draw canvas = W.draw self canvas
          method move point =
            create (module W : S with type t = obj) (W.move self point)
        end

But this looks unnecessary painful and also this recursion… it doesn’t look nice. And the further we go the more painful it will.

So the main feature of GADT is that (1) it enables us to easily implement t -> t transformations. Moreover, we can even recover the original type, when necessary. And even implement an abstract type without storing methods in its representation.

That is because the book was written in 2011-2013, and GADT landed in the language only in 2012. By that time first-class modules were the best way of implementing existential.

1 Like

That’s why I always found that using objects to solve this design question has been one of the worst idea I’ve ever seen in the history of human thought. Since I’m a mathematician, this is not your widget example I have in mind, but a basic algebraic structure hierarchy, starting for instance with the notion of group:

module type Group = sig
  type t
  val zero : t
  val add : t -> t -> t
  val neg : t -> t
end

And when I said previously that the first book that illustrate this methodology was Book V of Eucid’s Elements, I refer to the work of Eudoxus of Cnidus that was perfectly described by Jean Dieudonné in this conference (p.14-15, sorry but it’s in french).

Sometimes I have the impression that programmers are reinvented the wheel, but make it square. And I’m still not convinced that a square is a perfect shape for a wheel.

1 Like

It took me some times to find a response to this question that I hope will be suitable and intelligible.

TL; DR : the existential is the limit of the series of types you will define if you use the technique with extensible variants or polymorphic ones. Each time you add a new shape with one of this technique, you end up with a type which has a finite number of shape’s kind., and any of this finite sum type is a subtype of this existential. In others words, any kind of shape you will ever imagine already belongs to this existential sum type.

A little bit more explanation. Consider this module:

module M : sig
  type t
  int : int -> t
  float : float -> t
end = struct
  type t = Int of int | Float of float | String of string
  let int i = Int i
  let float j = Float j
end

The type int and float are subtypes of the type M.t, and the two functions int and float are the way we inject them in it. But for sure, there is non useful constructor String. Even if M.t is a common supertype of int and float, it is not the best one. Among all their common supertypes, the lowest one is their sum (up to isomorphism). With polymorphic variants, there is a built-in notion of subtyping, but it is very restricted: the only isomorphism it considers is the identity.

Now we have a notion of 'a shape_object, and we can define for instance a point shape_obejct, a circle shape_object and so on. Then comes this question: what is the lowest supertype of this infinity of shape_objects? Answer: the existential I define.

You can think of this existential as a variant with an infinity of constructors. And you can ask: where are this infinity of constructors? Answer: they are here:

to_shape : (module Shape with type t = 'a) -> 'a -> shape

Hence, the constructor for a point is to_shape (module Point), the one for a circle is to_shape (module Circle), and so on. No need for extensible variants or polymorphic ones to define this infinity of shape’s kind. To define a new shape’s kind you only need to implement a compilation unit that satisfies the Shape interface and apply the to_shape constructor.

When I wrote this

[to_shape (module Point) p1; to_shape (module Circle) r1]

it is conceptually similar to this one:

[(p1 :> shape); (r1 :> shape]

in order to define an homogeneous list of shapes.

I hope this time, it is clear to you what this existential really is.