Abstract data structure type

Dear OCaml community,

I am a computer science teacher and, while teaching OCaml programming, I use types for abstract data structures.

For example :

type 'a stack = { (* abstract imperative stack *)
    empty : unit -> bool;
    push : 'a -> unit;
    pop : unit -> 'a
};;    

A value of type stack must therefore implement those 3 functions.
Then, to create a stack implemented by an array :

let stack_of_array t = (* t is the array that will contain stack elements *)
    let n = ref 0 in {
    empty = (fun () -> !n = 0);
    push = (fun e -> if !n >= Array.length t 
            then failwith "Full stack"
            else (t.(!n) <- e; incr n);
    pop = (fun () -> if !n = 0
           then failwith "Empty stack"
           else (decr n; t.(!n)))
    }

let s = stack_of_array (Array.make 100 0) in (* example *)
s.push 4;
s.pop ()

I like this style because of the encapsulation, the OOP style, the ability to design algorithms working with any stack implementation. However, I never saw something like this anywhere and I wonder why.
My question is: I am doing it right ? Would it be considered a better practice using modules and functors (I don’t see why that would be more convenient though) ?

I don’t see anything wrong with it. Modules are strictly more powerful because you can expose type components in addition to value components, but if you don’t need the extra flexibility…

As to why this style is not used more often, I guess it is a cultural thing; OOP-style idioms are typically considered less readable/explicit than equivalent “functional” solutions.

Cheers,
Nicolas

1 Like

Once you get into some other data structures, modules and functors can offer better support. For example, a data structure of sorted lists. With a functor the types of lists sorted with respect to distinct comparison functions can be made incompatible. Another thing to consider is e.g. union of two sets, where the implementation wants to assume that both sets have the same underlying representation so that e.g. a merge of balanced trees can be used for union instead of relying on the exposed set interface and implementing union as repeated addition.

1 Like

If you wanted to implement a non-imperative stack, is there any way of keeping the underlying data structure abstract just using records? Or would you need to use modules for this one way or another?

Two nice features of specifying and implementing ADTs in modules, which I don’t think you can get out of records:

  • Structural sub-typing that lets implementations of richer specifications satisfy simpler ones
  • The related ability to quickly construct and extend structures and signatures by inclusion

Here’s an example:

(* Given a minimal (imperative) stack interface, which is a substructure of the interface exposed by Stdlib *)
module type Stack = sig
  type 'a t
  val create : unit -> 'a t
  val is_empty : 'a t -> bool
  val push : 'a -> 'a t -> unit
  val pop : 'a t -> 'a
end

(* We can specify a Deque interface the basis of the specification for the Stack  *)
module type Deque = sig
  include Stack
  val pop_back : 'a t -> 'a
  val push_back : 'a -> unit
end

(* Now, let's assume we have some code that needs a stack *)
module NeedsStack (S : Stack) = struct
  (* Does some stuff with the stack *)
end

(* And then suppose we are in an environment where we are given something implementing Dequeu *)
module Example (D: Deque) = struct
  (* We can use our Deque where a Stack is needed *)
  module M = NeedsStack (D)

  (* Also, Stdlib's Stack satisifies our interface *)
  module M' = NeedsStack (Stdlib.Stack)

  (* We can define a new stack on top of the Deque *)
  module AltStack : Stack = struct
    include D
    (* But we can change what pop means *)
    let pop = failwith "Unimplemented"
  end

  module M'' = NeedsStack (AltStack)
end

I find the connections between records and modules is neat, and kind of profound. I think Rossberg’s 1-ML paper paper is a very fun exploration of the connections. More knowledge folks around these parts can likely recommend other resources.

2 Likes

Thanks for your detailed answers ! You convince me to use modules (except for my students for which I don’t want to add difficulties).

2 Likes

Not a teacher myself, but an interesting thing is that the usage of first-class modules is really OOP style just with different terminology, which I believe is really helpful when “teaching” people about what is the core of objects.

Example of code that is quite syntax heavy but conceptually it’s pure OOP.

module type Stack_interface = sig
  exception Empty

  type value
  val push : value -> unit
  val pop : unit -> value
end
module Stack = struct
  (* constructor that allows inspection of internal state,
     think about it as a "protected" constructor *)
  let make_internal :
      type value.
      value list ref -> (module Stack_interface with type value = value) =
   fun state ->
    (module struct
      exception Empty

      type nonrec value = value
      let push value = state := value :: !state
      let pop () =
        match !state with
        | [] -> raise Empty
        | hd :: tl ->
            state := tl;
            hd
    end)

  (* constructor *)
  let make state = make_internal (ref state)

  (* constructor *)
  let empty () = make []
end

(* "inheritance" on interface *)
(* OCaml doesn't have recursive module types *)
module rec Clonable_stack_interface : sig
  module type S = sig
    include Stack_interface
    val clone :
      unit -> (module Clonable_stack_interface.S with type value = value)
  end
end =
  Clonable_stack_interface
module type Clonable_stack_interface = Clonable_stack_interface.S

module Clonable_stack = struct
  (* constructor *)
  let rec make :
      type value.
      value list -> (module Clonable_stack_interface with type value = value) =
   fun state ->
    let state = ref state in
    (* calls super constructor *)
    let module Stack = (val Stack.make_internal state) in
    (module struct
      (* inheritance *)
      include Stack
      let clone () = make !state
    end)

  (* constructor *)
  let empty () = make []
end

Very well. Now you can upgrade to the coq Module system.
Coq Module Type is like ocaml module type except now you can add any datatype laws to be proved.

Module Type Stack.

  Section local.

  Variable a : Set.

  Inductive t : Set.
  Parameter empty : t.

  Parameter push : a -> t -> t.
  Parameter top : t -> a.
  Parameter pop : t -> t.

  Axiom top_law : forall x : a, forall s : t, top (push x s) = x.
  Axiom pop_law : forall x : a, forall s : t, pop (push x s) = s.
  
  End local.

End Stack.
1 Like

I like this style because of the encapsulation, the OOP style, the ability to design algorithms working with any stack implementation. However, I never saw something like this anywhere and I wonder why.

If you’re going to do OO, you might as well use OCaml’s built-in support:

type 'a stack = < (* abstract imperative stack *)
  empty : bool;
  push : 'a -> unit;
  pop : 'a
>

let stack_of_array t = (* t is the array that will contain stack elements *)
  let n = ref 0 in
  object (_ : _ stack)
    method empty = (!n = 0)
    method push e =
      if !n >= Array.length t 
      then failwith "Full stack"
      else (t.(!n) <- e; incr n)
    method pop =
      if !n = 0
      then failwith "Empty stack"
      else (decr n; t.(!n))
  end

let () = (* example *)
  let s = stack_of_array (Array.make 100 0) in
  s#push 4;
  Printf.printf "> %d\n" s#pop

That avoids duplicating the vtable for every stack (you could also make n a mutable field instead of allocating a separate ref cell for it).

The main advantage of using modules is that it’s easier to see what the program does. In the OO style (yours or mine), every time you see a stack being used you have to work out all the places it could have come from to know what code will be called. In the functor style, the implementation is usually chosen once, statically, at the top of the file, not once per stack. So I think it’s better to avoid objects whenever you don’t need that flexibility.

3 Likes

Dear @fortierq,

I’m not sure you will add difficulties to your students with modules and functors. If I believe google, you’re teaching programming in the newly introduce section MP2i of CPGE. It’s a bit old to me (I was in MPSI more than 20 years ago), but I’m pretty sure your students will use these notions on a daily basis in mathematics (but with different words: a type is called a set and a module an algebraic structure).

For instance, the elementary structure of group is a module that satisfies this signature:

module type Group = sig
  type t
  val zero : t
  val add : t -> t -> t
  (* add x (neg x) = zero, for any x in t*)
  val neg : t -> t
end

Then, they will learn that when you have two groups (say G and H) then you can define their product group. This result directly translates to an OCaml functor:

module GroupProd (G : Group) (H : Group) = struct
  type t = G.t * H.t
  let zero = (G.zero, H.zero)
  let add (g, h) (g', h') = (G.add g g', H.add h h')
  let neg (g, h) = (G.neg g, H.neg h)
end

Then you can extend the notion of group with the ones of ring and field:

module type Ring = sig
  include Group
  val one : t
  val mul : t -> t -> t
end

module type Field = sig
  include Ring
 (* mul x (inv x) = one, for any x in t*)
  val inv : t -> t
end

and finally you get the notion of vector space:

module type Vector_space = sig
  (* elements of V.t are called vectors *)
  module V : Group
  (* elemetns of K.t are called scalars *)
  module K : Field
  val scale : K.t -> V.t -> V.t
end

They will even use first-class functor when they will study polynomial on an arbitrary ring:

(* the polynomial X^2 + X + 1 *)
let poly1 (type a) (module A : Ring with type t = a) x =
  let (+), ( * ) = A.add, A.mul in
  x * x + x + A.one
;;
val poly1 : (module Ring with type t = 'a) -> 'a -> 'a = <fun>

(* usage examples *)
poly1 (module Int) 2;;
- : int = 7

poly1 (module Float) 2.5;;
- : float = 9.75

You should check with their mathematics teacher, but I’m pretty sure that if they do not understand these notions they won’t have the level to go in second year, be it MPI or MPI* (at least it was the case at my time).

2 Likes

Hey @fortierq , I do this in one of my libraries which providers a buffer abstraction that can be a file buffer, string buffer, or a socket.

The upsides is that it allows you code to be dynamically dispatched. Downsides are that I don’t know how well the compiler can optimize these calls and it’s also really hard to know what code is being executed. Which is pretty standard fare for OOP. But this style does show up and is useful.