Tail cascade: a new indentation style for some OCaml constructs

I recently decided to change my indentation style for certain OCaml constructs in a way that I’m going to describe below. I just coined a name for this approach, “tail cascade”. I’m creating this topic to convince everyone that this is a cool idea you should adopt as well. Or at least tolerate it when you review other people’s code.

Problem

Programs that heavily use match often see a shift to the right due to nested indentation.

match foo with
| Foo -> ...
| Bar x ->
  match bar x with
  | FooBar -> ...
  | Blah y ->
    match f y with
    | Some z ->
      ...

Another problem with this style is that it suffers from the “dangling bar” issue: if you try to add a new case for one of the exterior match, it is parsed as belonging to the innermost match. People have been recommending (rightly) to use begin match .. end for all nested match constructs to avoid this issue.

match foo with
| Foo -> ...
| Bar x ->
  begin match bar x with
  | FooBar -> ...
  | Blah y ->
    begin match f y with
    | None -> ...
    | Some z ->
      ...
    end
  (* now this is safe *)
  | FooBlah -> ...
  end

But still the unpleasant shift to the right remains.

Proposal: cascading tail case

We should in general use begin match .. end for nested matches. But the “cascading tail case” proposal is to not do it for the last case of the pattern-matching, and instead de-indent (dedent) this last case – tail case.

match foo with
| Foo -> ...
| Bar x ->
match bar x with
| FooBar -> ...
| Blah y ->
match f y with
| None -> ...
| Some z ->
...

Note that with this indentation style, the “dangling match” problem is also avoided: unlike with the original, non end-protected program, the indentation makes it immediately obvious that any further case will be attached to the innermost match, and not any of the exterior ones.

A program using this “cascading tail” approach should always use begin match .. end for nested matches, except for a nested match returned within the last branch of an outer match, which can (optionally) be dedented instead.

The choice to dedent the last case corresponds to encouraging a sequential reading of the program, where the other cases are “auxiliary cases” checked first and dispatched quickly, and the last case is the “main part” where the “rest” of the logic of the program lies. This pattern is typical of nested pattern-matching on the option or result type for example:

match foo x with
| Error err ->
  fail_foo_error err
| Ok y ->
match bar y with
| Error err ->
  fail_bar_error err
| Ok () ->
...

Remark: it is not always the case that the Error constructor is the auxiliary case, and the Ok constructor is the main case; sometimes we implement fallback logic like “if foo work then we are good, but otherwise we have to do this and that”, and the error case is the most salient (and longer) part of the program logic. I would recommend being mindful, when you write code, of whether there is a most convincing way to “sequentialize” it (distinguish auxiliary and main/tail case), and avoid using cascading tails when there is no clear sequentialization choice.

Remark: some cases of tail cascades can be linearized by using a good definition of “bind” and a monadic style. This tends to be very limited however: it fixes one of the constructors to always be the “tail” constructor (always Some, always Ok), and it only works when the handling of the other constructors is very homogeneous (typically: return directly). In real code, many situations occur where the monadic style doesn’t fit the problem, but tail cascade does help writing a readable program.

Generalization: tail cascade

While I have never seen cascading tail cases in real-world OCaml code before (I’m happy to be given pointers; I think that the idea is not new, but I’m not aware of previous attempts to give it a catchy name and spread the cascade love), this is in fact a new (to me) instance of a common technique that is used for other OCaml constructs:

if foo x then ...
else if bar x then ...
else ... (* this `tail else` was dedented *)

let x = foo in
let y = bar in (* this `tail let` was dedented *)
...            (* and the rest as well *)

bind foo @@ fun x ->
bind bar @@ fun y -> (* this "tail function body" was dedented *)
...                  (* and the rest as well *)

I would call “tail cascade” (or maybe: “cascading tail”) the idea of dedenting the “rest” of an OCaml expression (compared to a strict tree-nesting-based approach) when it morally describes the “rest” of the expression. I use the name “tail” because those expressions are almost always in tail-position in the sense of tail-calls.

This general approach legitimizes some styles that I have seen, and sometimes used, in the wild, while at the same time considering that I may have been doing something improper, for example:

if foo then blah else
... (* dedented *)


Fun.protect
  ~finally:(...)
@@ fun () ->
... (* dedented *)


try simple_approach with exn ->
... (* dedented *)


1 +
2 + (* dedented *)
... (* dedented *)

Remark: after a then or else, many people share the reasonable view that any expression containing imperative constructs (foo; bar) should be enclosed in a begin .. end block to avoid surprising-precedence issue. Just as for nested match, this recommendation should be lifted for “tail else” constructs.

Remark: The last example is a case where the dedented expressions are not in tail-position from a runtime-evaluation point of view. I am not sure as whether the two notions should be made to coincide more strongly, but in any case I’m not fond of the style in this particular example, I prefer to move the infix operator to the beginning of the next line instead, following a different style and justification.

The possibility this “cascading tail” style today crucially relies on the nesting properties of open-ended syntactic constructs, notably let (commonly cascaded), and now match and if ... else. Proposals to transition to a syntax where match and else are forced to take a closing marker are incompatible with the cascading style. I have not made my mind on whether this should be considered a blocker for those proposals, but at least it shows that having the open-ended form available has value for certain programs.

5 Likes

Interesting. Indeed I like this style for e.g. else, but I hadn’t thought about using it for nested matches.
An interesting property, to me, is also that it reduces the amount of reindentation when refactoring code.

Some thoughts went into this when designing ocp-indent, which has indeed a few options for else. If strict_else is set to auto, the “dedent” will be done only if a specific “unclosable” construct follows the else (ie let, match, try, fun, function). This prevents mistakes with e.g.

if foo x then ...
else
  print_endline "oops";
visibly_out_of_else

Careful support has been added for tail-functions as well.

I think a similar approach could be added to match without too much trouble, e.g. by limiting the dedent to cases starting with match ?

One specific case is handled already though, but limited in scope:

  try foo x with E ->
  try bar y with F ->

this only happens with with Xx -> is on the same line, after which ocp-indent assumes that there will be a single case.

EDIT: the last feature explained above is also a reason why this may be problematic to implement automatically in general: there is no way for ocp-indent to tell if you are in the last case of your match or not (it won’t do that kind of look-ahead). Doing the deindent in the middle of the match would be awful.
Thankfully, limiting to direct nested match (i.e. Pat -> match ...) ensures that is not the case. It just wouldn’t be possible for things like Pat -> let _ = _ in match ...
Interactively, you can always manually indent the first line and ocp-indent will understand and adjust
to your intent for the next ones, if you really want to do that, though.

Lovely. I’d never thought about it this way: you’ve opened my eyes.

I also found myself doing something similar a while ago, I think the code is gone but it was something like

match x () with
| Some v -> v | None ->
match y () with
| Some v -> v | None ->
match z () with
| Some v -> v | None ->
...

I don’t know if it morally described the purpose of the expression (I replaced it with some kind of monadic thing later out of guilt) but I think I can see the value in this style; most code I’ve seen leaves the nested matches as the last case to avoid using begin/end or parenthesis so this feels like a natural extension of that.

@gasche I prototyped a dedicated option in ocp-indent, if you’re interested in trying it out :slight_smile:

opam pin git+https://github.com/OCamlPro/ocp-indent#match-tail-cascade
echo "match_tail_cascade=true" >> ~/.ocp-indent
1 Like

While I have never seen cascading tail cases in real-world OCaml code before (I’m happy to be given pointers

I used to use this style. Here’s an example in Mirage:

However, it’s hard to do it now that I use ocp-indent.

While I’m keen to avoid indentation in a lot of the cases you mention, I’m not very fond to use it for match: I like the fact that the indentation suggests the nesting level of your match.

2 Likes

@talex5 the new option would allow you to do it automatically for the first case (match to_fragments rest), but not the second, which starts with a let, and could syntaxically be not the last. (Although, as I pointed out, if you manually indent just the let first_size line, the rest will follow when indenting interactively)

ocamlformat supports this style, for the match part at least. See https://github.com/ocaml-ppx/ocamlformat/pull/827. It also supports not adding indentation in cases like this (the key is to use @@):

with_foo a @@ fun x ->
with_bar b @@ fun y ->
...

I’ve done precisely this style for years in my little ocaml projects, it’s always seemed natural at least…

This style is also used in some large Coq proofs, such as mathcomp [when doing a proof by cases the principal case is not indented] Indeed seems very useful.