type 'a expr = LiteralF of int | UnaryF of 'a | BinaryF of 'a * 'a
let fmap f = function
| LiteralF i -> LiteralF i
| UnaryF e -> UnaryF (f e)
| BinaryF (e1, e2) -> BinaryF (f e1, f e2)
module Term(F:sig type 'a t val fmap: ('a -> 'b) -> 'a t -> 'b t end) = struct
type 'a t = In of 'a t F.t
let out = function In term -> term
let in_ t = In t
let (%>) f g x = g (f x)
let rec bottomUp fn =
out
%> F.fmap (bottomUp fn)
%> in_
%> fn
let rec bottomUp_eta fn =
out
%> (fun x -> F.fmap (bottomUp_eta fn) x)
%> in_
%> fn
end
module Expr = Term(struct type 'a t = 'a expr let fmap = fmap end)
open Expr
let _ = bottomUp_eta Fun.id (in_ @@ LiteralF 1)
let () = print_endline "Passed test 1"
let _ = bottomUp Fun.id (in_ @@ LiteralF 1)
let () = print_endline "Passed test 2"

I took batteries included syntax for forward function composition.
The problem is that while the function bottomUp_eta works completely fine, bottomUp apparently triggers an infinite recursion (it ends up with a stack overflow), I am completely puzzled by this behaviour, I was aware that sometimes eta expansion could help type checking (by ruling out the possibility of a weird reference behaviour) but until now I thought that as soon as some code compiles, eta expanded version against standard one would not make any runtime difference. The only thing I could have imagined would be related to tail-call optimization (maybe easier in the eta-expanded version) but in this case the example is so trivial that I canâ€™t imagine I would exhaust the stack this way.

(* loops when called with a single parameter *)
let rec loop x = loop x
let () = loop () (* loops *)
let _ = loop (fun x -> x + 1) (* still loops *)
(* eta-expanded version *)
let rec loop_eta x = fun y -> loop_eta x y
let _ = loop_eta (fun x -> x + 1) (* does not loop anymore *)

Oh, is it because in the second case, as the compiler knows loop_eta x is supposed to be a function it doesnâ€™t try to evaluate it further as it is still missing one argument ?

This very thing caught me out when I read this post a while back

Oh, is it because in the second case, as the compiler knows loop_eta x is supposed to be a function it doesnâ€™t try to evaluate it further as it is still missing one argument ?

Close. The evaluation doesnâ€™t really depend on whether â€śloop_eta x is supposed to be a functionâ€ť, but whether it reduces to a value or not. In both cases, the code behaves the same way: expressions are reduced until they are values.

Function applications are not values: they must be reduced to values by first reducing their arguments, applying the function, and continuing to evaluate from there. When evaluating loop (fun x -> x + 1), (fun x -> x + 1) is already a value, so the function is applied causing the body of loop to be reduced. This requires evaluating loop (fun x -> x + 1) again and so the evaluation never terminates.

Lambdas are values: the body of the lambda is reduced only when it is applied. Evaluating loop_eta (fun x -> x + 1) again involves applying the function (the argument is still a value). The difference is that the body of loop_eta doesnâ€™t need to be reduced: itâ€™s already the lambda value fun y -> loop_eta x y. So weâ€™re done. (If you pass two arguments to loop_eta, itâ€™s a different story.)

In short, eta-expansion in OCaml delays the evaluation of the expression youâ€™re expanding around. If your code never applies the eta-expanded term, the body will never be reduced. This is important when either (a) there are side-effects associated with the reduction or (b) the reduction is non-terminating (as is the case for your code above).

The Haskell code relies on the fact that fmap will eventually not need its first argument (for the ground terms in the algebra). In Haskell, this is enough, since the first argument is evaluated lazily (only when it is required). In OCaml, you need to ensure that the recursive call doesnâ€™t happen until fmap actually needs it, which you can do by eta-expanding.

let rec bottomUp fn =
out
%> F.fmap (fun x -> bottomUp fn x)
%> in_
%> fn

As youâ€™ve seen, it also works to eta-expand around the fmap. In this case, the code will make one more recursive call for ground terms, but since fmap never actually applies the function constructed by that recursive call there is no more work to do.