Stack vs List Performance

I found this exercise in a data structures and algorithms textbook/interview prep book.

The idea of the exercises is that towers_of_hanoi should take a natural number n and return the sequence of moves that is appropriate in order to migrate a tower of n disks from the left peg to the right peg.

I implement it three ways:

  • Each sequence recursively defines simpler sequences, and combines them with Seq.append
  • using lists as a “pure stack” to track the progress of the algorithm
  • using a mutable stack to track the progress of the algorithm
type peg =
  | LL
  | MM
  | RR

let other (from : peg) (to_ : peg) : peg =
  match from, to_ with
  | LL, MM -> RR
  | LL, RR -> MM
  | LL, LL -> assert false
  | MM, LL -> RR
  | MM, RR -> LL
  | MM, MM -> assert false
  | RR, LL -> MM
  | RR, MM -> LL
  | RR, RR -> assert false

(** Move the top n rings from [from] to [to_] using [third]
    as an intermediary.
    Assumption: The top n rings at [from] are
    smaller than all other rings at [to_] and [third].
*)
let rec towers_of_hanoi_helper n from to_ third : (peg * peg) Seq.t =
  if n = 1 then Seq.return (from, to_) else
  (* if n <= 0 then fun () -> Seq.Nil else *)
    let s1 = towers_of_hanoi_helper (n-1) from third to_ in
    let s2 = Seq.return (from,to_) in
    let s3 = towers_of_hanoi_helper (n-1) third to_ from in
    Seq.append s1 @@ Seq.append s2 s3
;;

(**
   The initial state of the system is that you have
   three stacks, LL, MM, and RR, where the second
   and third stacks are empty, and the first
   stack contains the numbers 1...n in
   increasing order.

   The problem is to give a sequence of "moves"
   defined as ordered pairs (pos1,pos2)
   describing the sequence of transfers that leads
   to the stack ending up at the RR pole.

   Some basic observations:
   - If you want to move the bottommost ring
     in position LL to the bottommost ring in position
     RR, a necessary prerequisite is
     to move the whole tower *but* the bottommost ring,
     to position MM.
*)
let towers_of_hanoi (n : int) : (peg * peg) Seq.t = 
  towers_of_hanoi_helper n LL RR MM

let towers_of_hanoi_v2 (n : int) : (peg * peg) Seq.t =
  (* let ell : (peg * int * peg) list ref = ref [LL,n,RR] in *)
  let rec f (ell : (peg * int * peg) list) () =
    match ell with
    | (from, 1, to_) :: tl -> 
      Seq.Cons((from, to_), f tl)
    | (from, n, to_) :: tl ->
      let third = other from to_ in
      f ((from, n-1, third) :: (from, 1, to_) :: (third, n-1, to_) :: tl) ()
    | [] -> Seq.Nil
  in 
  f [LL, n, RR]

let towers_of_hanoi_v3 (n : int) : (peg * peg) Seq.t =
  let s : (peg * int * peg) Stack.t = Stack.create () in
  Stack.push (LL, n, RR) s;
  let rec f () =
    try
      let (from, n, to_) = Stack.pop s in
        if n = 1 then Seq.Cons((from,to_), f) else
          let third = other from to_ in 
          Stack.push (third, n-1, to_) s;
          Stack.push (from, 1, to_) s;
          Stack.push (from, n-1, third) s;
          f ()
    with
    | Stack.Empty -> Seq.Nil
  in 
  f

When I benchmark this, the time to iterate through the sequences while checking that all moves are legal is:

Time v1 : 12.050855
Time v2 : 0.656658
Time v3 : 1.324983

I can understand why the use of Seq.append is very slow because it involves many recursively nested closures. But I wonder if someone could tell me why the pure version with linked lists is twice as fast as the imperative version with a mutable stack.

I don’t know if this will get us somewhere, but as a hunch, if it’s easy for you to re-run, could you try a version where the try-with is more local in v3? Like a

match Stack.pop s with
| exception Stack.Empty -> Seq.Nil
| (from, n, to_) -> ...

I’m not super confident about the impact a priori but at a quick glance I’d tend to be a bit worried about the try with surrounding the recursive call.

Yes, that’s a really good point. I hadn’t thought of that and I also hadn’t really realized up until this point the way match e with exception A → | b → increases the expressiveness of the language.

After the change, the timing is

Time v2 : 0.422593
Time v3 : 0.736088

That’s not really negligible, the difference jumped from 100% to 70%

New code

let towers_of_hanoi_v3 (n : int) : (peg * peg) Seq.t =
  let s : (peg * int * peg) Stack.t = Stack.create () in
  Stack.push (LL, n, RR) s;
  let rec f () =
    match Stack.pop s with
    | exception Stack.Empty -> Seq.Nil
    | (from, n, to_) -> 
        if n = 1 then Seq.Cons((from,to_), f) else
          let third = other from to_ in 
          Stack.push (third, n-1, to_) s;
          Stack.push (from, 1, to_) s;
          Stack.push (from, n-1, third) s;
          f ()
  in 
  f

You may also be paying the cost of the write barrier, which is incurred whenever you modify a data structure.

Cheers,
Nicolas

1 Like

Do you have a good reference that discusses this and related issues? I’m not sure if this would be considered systems programming or computer architecture.

No, sorry, I don’t know of any good reference.

Cheers,
Nicolas

1 Like

Retrofitting Parallelism onto OCaml discusses the write barrier, and has references to prior related work.

1 Like

Not sure if it matters (much), but Stack offers constant-time length querying (Stack.length), which by-principle comes with runtime costs when pushing onto and popping from the stack (updating the counter).

This is why I wondered if using let stack = ref [] might make sense when you want to imperatively operate on a stack but aren’t interested in determining the current length of the stack.

1 Like