The shape design problem

Here’s a fun exercise. Imagine you have three shapes: Point, rectangle and circle, with the following properties:

  • Point: int x, int y
  • Rectangle: Point bottomLeft, Point topRight
  • Circle: Point center, int radius

Add two behaviours: area and draw.

How would you organize the implementation to enforce:

  • Cohesion: A class (or module) should be responsible for one thing only
  • Coupling: You should be able to isolate change in the system
  • Encapsulation: Classes shouldn’t share internal representation
  • Polymorphism: The design should be based on interfaces

It should be possible to do something like:

forall shapes as shape, do shape.draw(surface)

Or if you move out the drawing logic:

forall shapes as shape, do surface.draw(shape.getDrawData())

The simple solution in OCaml is to use ADT for each shape. This will make it easy to add new behaviour, just as expected for the expression problem in functional programming. You can also break out each shape into a separate module. This will make it easy to add new data (types of shapes). Are there any other alternatives?

The following question can be asked:

  • How easy would it be to move to a 3D representation?
  • How easy would it be to add a new shape, say, triangle?
  • How easy would it be to add a new behaviour to all shapes, like save/load from a SQL database?
4 Likes

IMO, when you have some different kinds of thing that should have the same type (test: does it make sense to have a list with elements of multiple kinds?) then an ADT is probably the right approach:

module Point = struct
  type t = { x : int; y : int }
end

module Rectangle = struct
  type t = { bottom_left : Point.t; top_right : Point.t }
end
module Circle = struct
  type t = { center : Point.t; radius : int }
end

module Shape = struct
  type t =
    | Point of Point.t
    | Rectangle of Rectangle.t
    | Circle of Circle.t
end

You could also have the record definitions inline in the definition of Shape.t.

In cases where you want a new kind of shape, but updating the type is inconvenient (for instance you don’t want to have to update a lot of functions that use it, or you don’t own the type definition), you can do this:

module Shape2 = struct
    type t = Triangle of Point.t * Point.t * Point.t | Shape of Shape.t
end

That doesn’t look super elegant in this case, but in practice I think either it make sense, or it’s a hack and the correct thing is to do the work of updating the original type.

Alright! So how would you define area_of_shape or draw_shape in your solution?

Just as functions in the Shape module (or somewhere else if that makes more sense for drawing).

In my experience that’s indeed the way to go, as @roddy illustrated.

I’ve found the expression “problem” not to be an actual problem in practice: having to edit all existing operations (shape etc) to add new shapes is not hard, esp. when the compiler walk you to all those places.

Also note that when you really need to, you can build the Shape.t constructor iteratively (adding constructors to the same Shape.t type from distinct modules) which helps with first-class modules in the few cases one has to resort to them:

module Shape = struct
  type t = ..
end
module Rectangle = struct
  type t = { bottom_left : Point.t ; etc... }
  type Shape.t += Rectangle of t
  ...

What’s hard in programming is not the expression problem, it is to refrain programmers to solve imaginary problems :wink:

1 Like

BTW, since you mention “the expression problem” I guess we all assumed you’ve heard of the somewhat “official” answer to this problem in OCaml, but if not, here is is: http://pauillac.inria.fr/~remy/project/expr/

1 Like

It’s an interesting problem, though be wary that there is certainly a bias in the design. These shapes are very much classifications, when it might be that a better model and solution to the broader yet-undiscovered problem doesn’t fit the class distinctions so well (as realized in sciences time and again). I shy away from classifying things and instead note attributes or properties, and allow any grouping or behavior to be contingent on those.

For example, when you move to 3D, you might find that you have a uniform representation for the render data, with a single render “function”. The difference was merely in the function used to generate the data. Collision data would also be “uniform”, with hierarchy, though with primitives like a Sphere, or conic section called out versus planar CGG or meshes. So, again, the difference becomes one of the data, not a programmatic “class”.

Anyway, for the problem as stated, I might take an object-based approach, using mixins for the eventual behaviors which would crop up… as Leo White clued me in to from Real World OCaml with a similar “shapes” problem: Classes - Real World OCaml. I mean, the problem as-phrased is almost screaming “I want objects with virtual interfaces” (even if that’s not what you will really be best off with in the end :wink: ).

Interesting problem not difficult to solve in OCaml. For sure, you could use the object system but I do not like it a lot, and it’s not really fun. To have more fun, and since it’s an expression problem, lets use the full expressiveness power of the OCaml type system : modules, first-class modules and GADT. :slight_smile:

First define your algebra on shapes as a module type:

module type SHAPE = sig
  type t
  val area : t -> float
  val draw : t -> unit
end

then implement your different kinds of shapes:

module Point : SHAPE = struct
  type t = {x : int; y : int}
  let area _ = 0.0
  let draw {x; y} = ...
end

module Circle : SHAPE = struct ... end
module Rectangle : SHAPE = struct ... end

Now, since you want containers for shapes (to iterate over them), you need to define the type of shapes, and that’s where we use GADT. According to your definition, a shape is something with an area that we can draw. Here’s how we define this in OCaml:

type shape = Shape : (module SHAPE with type t = 'a) * 'a -> shape

and now you can define, for instance, a list of shape:

let l = [Shape ((module Point), p1); Shape ((module Circle), c1); ...]

(* and iterate over it *)
List.iter (fun (Shape ((module M), v)) -> M.draw v) l

Basically a shape (as defined with a GADT) is similar to an object, for instance with a point:

let point x y = object
  val x = x
  val y = y
  method area = 0.0
  method draw () = ()
end

but here the instance variable (x and y) are really what define a point, and this type cannot be projected out of this object in the type system (that’s one of the reason I don’t like a lot the object sytem). And this object is just a flatten representation of the shape type

For a more detail use of this method see this thread.

13 Likes

That’s the voice of a programmer working alone. :joy: Imagine my situation, a PHP web app started 2003 with 300+ unique contributors with vastly different backgrounds. It becomes very important to isolate change and prepare for a healthy software evolution path.

2 Likes

Thanks! Someone on IRC recommended this solution. The only drawback is the clunky inclusion of the module together with the type. People in PHP-land also talk about moving draw or save/load to SQL out of the Shape class or module, but that’s easy to adjust.

The most basic solution:

type point = {
  x : int;
  y : int;
}

type rectangle = {
  bottom_left : point;
  top_right : point;
}

type shape =
    | Point of point
    | Rectangle of rectangle

let area_of_shape (s : shape) : int =
  match s with
    | Point p -> 0
    | Rectangle {bottom_left; top_right} -> 1

let l = [
    Point {x = 1; y = 2};
    Rectangle {bottom_left = {x = 2; y = 3}; top_right = {x = 3; y = 4}};
]

let areas = List.map (fun s -> area_of_shape s) l

Here’s another solution using GADT instead of ADT (no modules):

type point
type rectangle

type _ shape = 
  | Point : int * int -> point shape
  | Rec : point shape * point shape -> rectangle shape

let new_point () = Point (1, 2)

let new_rec () = Rec (Point (1, 2), Point (2, 3))

(** Existential wrapper *)
type any_shape = Any : 'a shape -> any_shape

let area_of_shape : type a. a shape -> int = fun s -> match s with
  | Point (x, y) -> 10
  | Rec (Point (bx, by), Point (tx, ty))-> 20

let l = [Any (new_point ()); Any (new_rec ());]

let areas = List.map (fun (Any s) -> area_of_shape s) l

And this is not a solution to your problem, it doesn’t solve all of your constraint. The representation is not hidden (no encapsulation, that’s why I use the module system and its abstract type facility), you will have to update your gadt each time you want to define a new shape and you cannot have a list of shapes (unless you wrap your gadt in another existential one, as I did in the other thread).

What I did is a mere translation and formalization in OCaml of what you wrote in plain english in your first post.

Oh, sure! I just wanted to list another alternative. The biggest difference is number of changes needed to add a new operation vs a new shape. If you don’t hide the type, you only need one new function for a new operation, and the other way around.

Oh, I actually didn’t see this += construct before!

I don’t see the difference with the module version (which I’m still convince is the correct formalisation of your problem).

If you use ADT to define your type shape, then when you add a new function you will need to pattern match on each case of your variant:

let foo = function
  | Point -> code_for_point
  | Rectangle _ -> code_for_rectangle

but with the module version, you’ll have:

module Point = struct
  ...
  let foo p = code_for_point (* same code as above *)
end

module Rectangle = struct
  ...
  let foo r = code_for_rectangle (* same code as above *)
end

With ADT, contrary to the module approach, you will loose cohesion, coupling and encapsulation constraint. In your team this is not necessary the same persons who manage the Point module, the Rectangle one and the Shape one. So when you want to add a method on shape, the team who manage the Shape module need to be aware of all the implementation details of the different shapes in order to implement the method foo. That is not its job.

With module, the team in charge of the shape component will just do this:

(* file shape.ml *)
module type T = sig
  type t
  val area : t -> float
  val draw : t -> unit
  (* they add the foo method in the interface *)
  val foo : t -> foo
end

type t = Shape : 'a * (module T with type t = 'a) -> t

let area (Shape (v, (module M))) = M.area v

let draw (Shape (v, (module M))) = M.draw v

(* they will add this dispatch function to their code *)
let foo (Shape (v, (module M))) = M.foo v

(* rest of their code *)
...
end

and the person who have more knowledge about point and rectangle will have to implement the method foo in their respective module.

This way, the only thing developers of Point and Rectangle module need to know is the interface they must satisfy: Shape.T.

The developers of the Shape module need not to know anything about point, rectangle or any other kinds of shapes your team will add time after time.

And the user of the shapes need only to know how to define and construct a new shape (with the Shape constructor of the GADT in the Shape module) and the interface exposed by the Shape module in order to use them accordingly.

With your ADT approach you cannot divide the concern as I explain above.

Finally, when you add a new shape, say a triangle, you won’t have to update anything with module, but you will have to update your ADT and the code of each method to deal with this new kind of shape.

3 Likes

Sure, I agree the module approach has better qualities over all. :slight_smile: The only benefit to the ADT implementation is exactly that, isolating change when adding a new operation. Which might make sense in a domain where the data never changes.

By the way, we didn’t look at a solution where the functions are in a record… I will test it.

With some help from IRC, here’s a module solution with function inside record:

type 'a s = {
  data : 'a ;
  area : 'a -> int
}
 
module type SHAPE = sig
    type t
    val make : unit -> t s
end
 
type any_shape = Any : 'a s -> any_shape
 
module Point : SHAPE = struct
    type t = {x : int; y : int}
 
    let make () =
        {
            data = {x = 1; y = 2};
            area = (fun p -> 0)
        }
end
 
module Rect : SHAPE = struct
    type t = {bottom_left : Point.t s ; top_right : Point.t s}
 
    let make () =
        {
            data = {bottom_left = Point.make (); top_right = Point.make ()};
            area = (fun p -> 1)
        }
end
 
let l = [Any (Rect.make ()); Any (Point.make ())]
 
let area =
  function
  | Any s ->
      s.area s.data
 
let areas = List.map area l

This solution is analogous to the one I describe. Your type 'a s is isomorphic to this one 'a * (module SHAPE with type t = 'a), it’s just a flatten representation of the same product type.

By the way, this design is basically the one use in jane street Core and Base library, those used in the book Real World OCaml.

If you take the Shape module it will have such an interface:

(* shape.mli *)
module type T = sig
  type t
  val area : t -> float
  val draw : t -> unit
end

(* this one is isomorphic to the type `any_shape` *)
type shape = Shape : 'a * (module T with type t = 'a) -> shape

(* what you can do with a shape *)
val foo : shape -> foo
val bar : shape -> bar

But if you remove the existential wrapper, shape is just a product type and if you curryfy the type of foo and bar you’ll end up with this signature:

(* shape.mli *)
module type T = sig
  type t
  val area : t -> float
  val draw : t -> unit
end

val foo : (module T with type t = 'a) -> 'a -> foo
val bar : (module T with type t = 'a) -> 'a -> bar

And this kind of signature is at the heart of Core and Base design. So you can be sure that it works and scales very well. :wink:

4 Likes

Hi @kantian, I like your solution a lot, as it seems to be fully compatible with the Tagless Final design.

When I see this thread, I miss the old mailing list, which allowed me to pin some valuable content and keep them in a dedicated folder. I do not know if there is a ressource/book which explain — not the OCaml syntax , but — such designs

In the chapter entitled “First Class Modules” of Jane Street’s Real World Ocaml, instead of using an existential type wrapper with a GADT to hold an instance and the functions to operate on it, it employs a wrapper module incorporating the instance data (in a variable named ‘this’) from which the first class module is constructed.

I have to say I find that approach a lot more intelligible. Until reading this line of postings I didn’t realize that you could have type variables in the constructor not appearing in the return type (although I see they are mentioned in the manual). At first sight it seems somewhat like casting around in the weeds of the type system; at any rate, it seems somewhat non-obvious to me. Perhaps I will get used to it.