List reversal in tail-recursive functions?

So, I’m working through the 99 problems, and I was working on the flatten function.

type 'a node =
  | One of 'a 
  | Many of 'a node list

val flatten : 'a node list  -> 'a list

I got a non tail-recursive version working in no time.

let rec flatten = function
  | [] -> []
  | One o :: t -> o :: (flatten t)
  | Many m :: t -> (flatten m) @ (flatten t)

I wanted a tail-recursive implementation, so I went with my gut and wrote this.

let flatten l =
  let rec aux acc = function
    | [] -> acc
    | One o :: t -> aux (o :: acc) t
    | Many m :: t -> aux ((aux [] m) @ acc) t
  in
  aux [] l

But it returns the list in reverse.

flatten [ One "a" ; Many [ One "b" ; Many [ One "c" ; One "d" ] ; One "e" ] ];;

- : string list = ["e"; "d"; "c"; "b"; "a"]

This is not the first time I see a tail-recursive function needing a List.rev for it to work, but I have a hard time wrapping my head around why it works like this.


How do you think about these recursions, and how do you make sense of the List.rev calls that are needed?

My recommendation: execute the code in your head on a small example (two elements should be enough).

For “in my head” I actually write a reduction sequence in a text buffer of an editor – this is what I show to students at least.

1 Like

The reason, the way I see it, is that you can only easily add elements to a list at the beginning, the end of a list is not accessible. So when you build a list element by element, you will end up with a list of the elements in the opposite order to the order you processed them. But conversely, when you go through a list, given that only the beginning is accessible, you will most naturally access the elements in the order of the list. This is how the reversal arises. Just think of when you’re reading through something you have printed on loose paper. You read one page, put it aside, read the next page, put it on top of the first one. At the end of this process the pages are in the wrong order.

Non tail-recursive functions provide an easy way to avoid this reversal by processing the elements in reverse order (first recurse to get all the way to the last element, then process it). This is not easily done with tailrec functions or with imperative code (you would need to copy the list into an array or something like that).

Also, in this specific case, given that every application of (@) needs to traverse the first argument, I would expect that the tailrec function is faster. The tailrec function “goes through” every leaf once when it builds the result, then a second time when it reverses it. Unless I’m confused, the non-tailrec function goes through every leaf as many times as the leaf has ancestors where it is not in the last branch. actually no, every ancestor counts, because even some_list @ [] needs to traverse some_list.

Edit: scratch that, I didn’t check how you wrote your tailrec version but actually it’s not tailrec and if you do a tailrec version you need to keep track of several lists which will end up doing as many list cons’s as the non tail-rec one:

let flatten l =
  let rec aux acc now later =
    match now with
    | [] ->
      begin match later with
      | [] -> acc
      | first :: rest -> aux acc first rest
      end
    | One o :: rest -> aux (o :: acc) rest later
    | Many []  :: rest -> aux acc rest later
    | Many m :: rest -> aux acc m (rest :: later)
    in
  List.rev (aux [] l [])
1 Like

I think this may be more straightforward / clearer with Seq.unfold.

Three observations, the first two are short, the second pretty long. This is involved, so I’ll use Markdown headers to demarcate sections. After I post this, I’ll do “append” this way, and I think it should be possible to show precisely how you get the List.rev out of it.

The final List.rev is just unwinding the stack

The reason that you end up doing a List.rev is simple: the accumulator is a replacement for the stack. And a stack is “unwound” by popping frames off it, executing whatever return-action is required at each stage. In many list-based tail-recursive functions, that return-action is to take whatever was on the stack, and cons it onto the front of the result. So you naturally end up with a List.rev. That said, it all depends on the algorithm you choose. The algorithm you started with does not naturally lead to a List.rev (though your version of tail-recursion does).

Not tail-recursive

Your tail-recursive solution isn’t actually tail-recursive:

aux ((aux [] m) @ acc) t

isn’t tail-recursive – it uses the stack to suspend the flattening of t, while flattening m.

Your implementation uses left-to-right evaluation-order; I’ll follow that in my code below. Just noting it.

BTW, it also uses @ in a place that is definitely not tail-recursive in the normal meaning. consider a tree that is lop-sidedly left-recursive and deep. As you flatten it, you will end up executing @ over-and-over with large first-argument and small second-argument. This will incur massive copying. I’m not going to correct that in this solution, because that would be modifying the starting program, but it would be an obvious thing to correct it and then work thru the steps below.

Tail-recursion, step-by-step

So let’s do it again, this time in a mechanical way, to induce proper tail-recursion. This isn’t as hard as it seems: the steps are always the same:

  1. A-translate the code: this means give each nontrivial function-call that uses the stack a name with let. It also imposes evaluation-order (again, I’m going with left-to-right since that’s what you chose)
  2. CPS-translate
  3. defunctionalize: this means th take the continuations in #b and turn them into datastructures.
  4. Then I monomorphized, so that I could run with #trace and see the execution of the function step-by-step.
type 'a node =
  | One of 'a 
  | Many of 'a node list

module Stacky = struct

let rec flatten = function
  | [] -> []
  | One o :: t -> o :: (flatten t)
  | Many m :: t -> (flatten m) @ (flatten t)
end

module Stacky2 = struct
let rec flatten = function
  | [] -> []
  | One o :: t -> o :: (flatten t)
  | Many m :: t -> List.append (flatten m) (flatten t)
end

module Atrans = struct
let flatten t =
let rec flatrec = function
  | [] -> []
  | One o :: t ->
     let l = (flatrec t) in
     o :: l
  | Many m :: t ->
     let l1 = (flatrec m) in
     let l2 = (flatrec t) in
     List.append l1 l2
in flatrec t
end

module CPS = struct
let flatten t =
let rec flatrec t kont = match t with
  | [] -> kont []
  | One o :: t ->
     (flatrec t) (*AFTER_ONE*) (fun l ->
         kont (o :: l)
       )
  | Many m :: t ->
     (flatrec m) (*AFTER_MANY_HD*) (fun l1 ->
     (flatrec t) (*AFTER_MANY_TL*) (fun l2 ->
         kont (List.append l1 l2)
       ))
in flatrec (*INIT*) t (fun x -> x)
end

module Defunc = struct

type 'a kont_t =
  INIT
| AFTER_ONE of 'a kont_t * 'a
| AFTER_MANY_HD of 'a kont_t * 'a node list
| AFTER_MANY_TL of 'a kont_t * 'a list

let rec flatrec t kont = match t with
  | [] -> dokont kont []
  | One o :: t ->
     (flatrec t) (AFTER_ONE(kont, o))
  | Many m :: t ->
     (flatrec m) (AFTER_MANY_HD(kont, t))

and dokont kont arg = match kont with
    INIT -> (fun x -> x) arg
  | AFTER_ONE (kont, o) -> (fun l ->
         dokont kont (o :: l)
  ) arg
  | AFTER_MANY_HD (kont, t) -> (fun l1 ->
     (flatrec t) (AFTER_MANY_TL (kont, l1))) arg

  | AFTER_MANY_TL(kont, l1) -> (fun l2 ->
         dokont kont (List.append l1 l2)
       ) arg

let flatten t = flatrec t INIT
end

module Mono = struct

type kont_t =
  INIT
| AFTER_ONE of kont_t * string
| AFTER_MANY_HD of kont_t * string node list
| AFTER_MANY_TL of kont_t * string list

let rec flatrec t kont = match t with
  | [] -> dokont kont []
  | One o :: t ->
     (flatrec t) (AFTER_ONE(kont, o))
  | Many m :: t ->
     (flatrec m) (AFTER_MANY_HD(kont, t))

and dokont kont arg = match kont with
    INIT -> (fun x -> x) arg
  | AFTER_ONE (kont, o) -> (fun l ->
         dokont kont (o :: l)
  ) arg
  | AFTER_MANY_HD (kont, t) -> (fun l1 ->
     (flatrec t) (AFTER_MANY_TL (kont, l1))) arg

  | AFTER_MANY_TL(kont, l1) -> (fun l2 ->
         dokont kont (List.append l1 l2)
       ) arg

let flatten t = flatrec t INIT
end

Discussion

So you can see now that you have a list of “frames”, and each frame is one of:

  1. a string
  2. a node-list (remaining to be flattened
  3. a string-list (already-flattened)

And we can see that this really is a “machine” in the sense of the SECD or CEK machine.

1 Like

I thought it might be useful to back up to append and look at it in terms of tail-recursion. Here’s the same development that I did for your example, but for append. The progression this time is:

  1. recusive
  2. a-translated
  3. CPS
  4. defunctionalize
  5. replace 'a kont_t with 'a list

At the bottom, you can see the function dokont and immediately below it, I wrote out rev_append. As you can see, they’re identical. It is in this sense that the kont (the “accumulating parameter”) is really a stack, and the rev is really executing-as-we-unwind the stack.

module AppendRec = struct
let append l1 l2 =
  let rec apprec l1 = match l1 with
      [] -> l2
    | (h::t) -> h :: (apprec t)
  in apprec l1
end

module AppendA = struct
let append l1 l2 =
  let rec apprec l1 = match l1 with
      [] -> l2
    | (h::t) ->
       let r = (apprec t)
       in h :: r
  in apprec l1
end

module AppendCPS = struct
let append l1 l2 =
  let rec apprec l1 kont = match l1 with
      [] -> kont l2
    | (h::t) ->
       (apprec t) (fun r ->
           kont (h :: r))
  in apprec l1 (fun x -> x)
end

module AppendDefunc = struct

type 'a kont_t = INIT | AFTER of 'a * 'a kont_t

let append l1 l2 =
  let rec apprec l1 kont = match l1 with
      [] -> dokont kont l2
    | (h::t) ->
       (apprec t) (AFTER(h,kont))

  and dokont kont arg = match kont with
      INIT -> (fun x -> x) arg
    | AFTER(h, kont) ->
       (fun r ->
         dokont kont (h :: r)) arg

  in apprec l1 INIT
end

module AppendKonrAsList = struct

type 'a kont_t = 'a list

let append l1 l2 =
  let rec apprec l1 kont = match l1 with
      [] -> dokont kont l2
    | (h::t) ->
       (apprec t) (h :: kont)

  and dokont kont arg = match kont with
      [] -> arg
    | (h :: kont) ->
         dokont kont (h :: arg)

  in apprec l1 []
end

let rec rev_append l1 l2 = match l1 with
    [] -> l2
  | (h::t) -> rev_append t (h :: l2)

You might find this useful: a somewhat-fixed version of your tail-recursive version:

module Stacky3 = struct
let rec flatrec t acc = match t with
  | [] -> acc
  | One o :: t -> o::(flatrec t acc)
  | Many m :: t ->
     flatrec m (flatrec t acc)

let flatten t = (flatrec t [])
end

It’s still not tail-recursive, b/c it uses the stack as it recurses down the tree, but it accumulates the result, and does so in the correct order.

ETA: Another version. Olivier Danvy once explained to me how “flatten a list by rotations” is related to A-translation. It was 30 years ago, so I’ve forgotten the details – he probably wrote it up in one of his excellent papers (I always use his CPS-translation method, b/c so straightforward) but here’s your problem, solved using rotations, and oh-so-ready to turn into tail-recursion:

module RW = struct
let rec flatten = function
  | [] -> []
  | One o :: t -> o :: (flatten t)
  | Many m :: t -> flatten (m @ t)
end
1 Like