I recently decided to change my indentation style for certain OCaml constructs in a way that I’m going to describe below. I just coined a name for this approach, “tail cascade”. I’m creating this topic to convince everyone that this is a cool idea you should adopt as well. Or at least tolerate it when you review other people’s code.
Problem
Programs that heavily use match
often see a shift to the right due to nested indentation.
match foo with
| Foo -> ...
| Bar x ->
match bar x with
| FooBar -> ...
| Blah y ->
match f y with
| Some z ->
...
Another problem with this style is that it suffers from the “dangling bar” issue: if you try to add a new case for one of the exterior match
, it is parsed as belonging to the innermost match
. People have been recommending (rightly) to use begin match .. end
for all nested match constructs to avoid this issue.
match foo with
| Foo -> ...
| Bar x ->
begin match bar x with
| FooBar -> ...
| Blah y ->
begin match f y with
| None -> ...
| Some z ->
...
end
(* now this is safe *)
| FooBlah -> ...
end
But still the unpleasant shift to the right remains.
Proposal: cascading tail case
We should in general use begin match .. end
for nested matches. But the “cascading tail case” proposal is to not do it for the last case of the pattern-matching, and instead de-indent (dedent) this last case – tail case.
match foo with
| Foo -> ...
| Bar x ->
match bar x with
| FooBar -> ...
| Blah y ->
match f y with
| None -> ...
| Some z ->
...
Note that with this indentation style, the “dangling match” problem is also avoided: unlike with the original, non end
-protected program, the indentation makes it immediately obvious that any further case will be attached to the innermost match, and not any of the exterior ones.
A program using this “cascading tail” approach should always use begin match .. end
for nested matches, except for a nested match returned within the last branch of an outer match, which can (optionally) be dedented instead.
The choice to dedent the last case corresponds to encouraging a sequential reading of the program, where the other cases are “auxiliary cases” checked first and dispatched quickly, and the last case is the “main part” where the “rest” of the logic of the program lies. This pattern is typical of nested pattern-matching on the option
or result
type for example:
match foo x with
| Error err ->
fail_foo_error err
| Ok y ->
match bar y with
| Error err ->
fail_bar_error err
| Ok () ->
...
Remark: it is not always the case that the Error
constructor is the auxiliary case, and the Ok
constructor is the main case; sometimes we implement fallback logic like “if foo
work then we are good, but otherwise we have to do this and that”, and the error case is the most salient (and longer) part of the program logic. I would recommend being mindful, when you write code, of whether there is a most convincing way to “sequentialize” it (distinguish auxiliary and main/tail case), and avoid using cascading tails when there is no clear sequentialization choice.
Remark: some cases of tail cascades can be linearized by using a good definition of “bind” and a monadic style. This tends to be very limited however: it fixes one of the constructors to always be the “tail” constructor (always Some
, always Ok
), and it only works when the handling of the other constructors is very homogeneous (typically: return directly). In real code, many situations occur where the monadic style doesn’t fit the problem, but tail cascade does help writing a readable program.
Generalization: tail cascade
While I have never seen cascading tail cases in real-world OCaml code before (I’m happy to be given pointers; I think that the idea is not new, but I’m not aware of previous attempts to give it a catchy name and spread the cascade love), this is in fact a new (to me) instance of a common technique that is used for other OCaml constructs:
if foo x then ...
else if bar x then ...
else ... (* this `tail else` was dedented *)
let x = foo in
let y = bar in (* this `tail let` was dedented *)
... (* and the rest as well *)
bind foo @@ fun x ->
bind bar @@ fun y -> (* this "tail function body" was dedented *)
... (* and the rest as well *)
I would call “tail cascade” (or maybe: “cascading tail”) the idea of dedenting the “rest” of an OCaml expression (compared to a strict tree-nesting-based approach) when it morally describes the “rest” of the expression. I use the name “tail” because those expressions are almost always in tail-position in the sense of tail-calls.
This general approach legitimizes some styles that I have seen, and sometimes used, in the wild, while at the same time considering that I may have been doing something improper, for example:
if foo then blah else
... (* dedented *)
Fun.protect
~finally:(...)
@@ fun () ->
... (* dedented *)
try simple_approach with exn ->
... (* dedented *)
1 +
2 + (* dedented *)
... (* dedented *)
Remark: after a then
or else
, many people share the reasonable view that any expression containing imperative constructs (foo; bar
) should be enclosed in a begin .. end
block to avoid surprising-precedence issue. Just as for nested match
, this recommendation should be lifted for “tail else” constructs.
Remark: The last example is a case where the dedented expressions are not in tail-position from a runtime-evaluation point of view. I am not sure as whether the two notions should be made to coincide more strongly, but in any case I’m not fond of the style in this particular example, I prefer to move the infix operator to the beginning of the next line instead, following a different style and justification.
The possibility this “cascading tail” style today crucially relies on the nesting properties of open-ended syntactic constructs, notably let
(commonly cascaded), and now match
and if ... else
. Proposals to transition to a syntax where match
and else
are forced to take a closing marker are incompatible with the cascading style. I have not made my mind on whether this should be considered a blocker for those proposals, but at least it shows that having the open-ended form available has value for certain programs.