Three observations, the first two are short, the second pretty long. This is involved, so I’ll use Markdown headers to demarcate sections. After I post this, I’ll do “append” this way, and I think it should be possible to show precisely how you get the List.rev
out of it.
The final List.rev
is just unwinding the stack
The reason that you end up doing a List.rev is simple: the accumulator is a replacement for the stack. And a stack is “unwound” by popping frames off it, executing whatever return-action is required at each stage. In many list-based tail-recursive functions, that return-action is to take whatever was on the stack, and cons it onto the front of the result. So you naturally end up with a List.rev. That said, it all depends on the algorithm you choose. The algorithm you started with does not naturally lead to a List.rev
(though your version of tail-recursion does).
Not tail-recursive
Your tail-recursive solution isn’t actually tail-recursive:
aux ((aux [] m) @ acc) t
isn’t tail-recursive – it uses the stack to suspend the flattening of t
, while flattening m
.
Your implementation uses left-to-right evaluation-order; I’ll follow that in my code below. Just noting it.
BTW, it also uses @
in a place that is definitely not tail-recursive in the normal meaning. consider a tree that is lop-sidedly left-recursive and deep. As you flatten it, you will end up executing @
over-and-over with large first-argument and small second-argument. This will incur massive copying. I’m not going to correct that in this solution, because that would be modifying the starting program, but it would be an obvious thing to correct it and then work thru the steps below.
Tail-recursion, step-by-step
So let’s do it again, this time in a mechanical way, to induce proper tail-recursion. This isn’t as hard as it seems: the steps are always the same:
- A-translate the code: this means give each nontrivial function-call that uses the stack a name with
let
. It also imposes evaluation-order (again, I’m going with left-to-right since that’s what you chose)
- CPS-translate
- defunctionalize: this means th take the continuations in #b and turn them into datastructures.
- Then I monomorphized, so that I could run with
#trace
and see the execution of the function step-by-step.
type 'a node =
| One of 'a
| Many of 'a node list
module Stacky = struct
let rec flatten = function
| [] -> []
| One o :: t -> o :: (flatten t)
| Many m :: t -> (flatten m) @ (flatten t)
end
module Stacky2 = struct
let rec flatten = function
| [] -> []
| One o :: t -> o :: (flatten t)
| Many m :: t -> List.append (flatten m) (flatten t)
end
module Atrans = struct
let flatten t =
let rec flatrec = function
| [] -> []
| One o :: t ->
let l = (flatrec t) in
o :: l
| Many m :: t ->
let l1 = (flatrec m) in
let l2 = (flatrec t) in
List.append l1 l2
in flatrec t
end
module CPS = struct
let flatten t =
let rec flatrec t kont = match t with
| [] -> kont []
| One o :: t ->
(flatrec t) (*AFTER_ONE*) (fun l ->
kont (o :: l)
)
| Many m :: t ->
(flatrec m) (*AFTER_MANY_HD*) (fun l1 ->
(flatrec t) (*AFTER_MANY_TL*) (fun l2 ->
kont (List.append l1 l2)
))
in flatrec (*INIT*) t (fun x -> x)
end
module Defunc = struct
type 'a kont_t =
INIT
| AFTER_ONE of 'a kont_t * 'a
| AFTER_MANY_HD of 'a kont_t * 'a node list
| AFTER_MANY_TL of 'a kont_t * 'a list
let rec flatrec t kont = match t with
| [] -> dokont kont []
| One o :: t ->
(flatrec t) (AFTER_ONE(kont, o))
| Many m :: t ->
(flatrec m) (AFTER_MANY_HD(kont, t))
and dokont kont arg = match kont with
INIT -> (fun x -> x) arg
| AFTER_ONE (kont, o) -> (fun l ->
dokont kont (o :: l)
) arg
| AFTER_MANY_HD (kont, t) -> (fun l1 ->
(flatrec t) (AFTER_MANY_TL (kont, l1))) arg
| AFTER_MANY_TL(kont, l1) -> (fun l2 ->
dokont kont (List.append l1 l2)
) arg
let flatten t = flatrec t INIT
end
module Mono = struct
type kont_t =
INIT
| AFTER_ONE of kont_t * string
| AFTER_MANY_HD of kont_t * string node list
| AFTER_MANY_TL of kont_t * string list
let rec flatrec t kont = match t with
| [] -> dokont kont []
| One o :: t ->
(flatrec t) (AFTER_ONE(kont, o))
| Many m :: t ->
(flatrec m) (AFTER_MANY_HD(kont, t))
and dokont kont arg = match kont with
INIT -> (fun x -> x) arg
| AFTER_ONE (kont, o) -> (fun l ->
dokont kont (o :: l)
) arg
| AFTER_MANY_HD (kont, t) -> (fun l1 ->
(flatrec t) (AFTER_MANY_TL (kont, l1))) arg
| AFTER_MANY_TL(kont, l1) -> (fun l2 ->
dokont kont (List.append l1 l2)
) arg
let flatten t = flatrec t INIT
end
Discussion
So you can see now that you have a list of “frames”, and each frame is one of:
- a string
- a node-list (remaining to be flattened
- a string-list (already-flattened)
And we can see that this really is a “machine” in the sense of the SECD or CEK machine.