Ocamlformat: unexpected 4x indent for multi-line arguments

I’ve been playing with ocamlformat and am mostly very happy – thanks for the great tool! – but I am running into one unexpected behavior that I have been unable to tame. Here is an example:

Actual formatting (4x indentation):

  Array.fold gear_nums_mat ~init:0 ~f:(fun init gear_nums_row ->
      Array.fold gear_nums_row ~init ~f:(fun init gear_nums ->
          match gear_nums with
          | [x; y] -> init + (x * y)
          | _ -> init
      )
  )

Expected behavior (2x indentation):

  Array.fold gear_nums_mat ~init:0 ~f:(fun init gear_nums_row ->
    Array.fold gear_nums_row ~init ~f:(fun init gear_nums ->
      match gear_nums with
      | [x; y] -> init + (x * y)
      | _ -> init
    )
  )

The exact problem is reproducible with this .ocamlformat file:

profile = conventional
version = 0.26.1
indicate-multiline-delimiters = closing-on-separate-line
break-cases = toplevel

The 4x indent shows up even with an empty .ocamlformat file, albeit in slightly different form:

  Array.fold gear_nums_mat ~init:0 ~f:(fun init gear_nums_row ->
      Array.fold gear_nums_row ~init ~f:(fun init gear_nums ->
          match gear_nums with [ x; y ] -> init + (x * y) | _ -> init))

Can anyone shed light on what’s going on – i.e., why the 4x indent when most other things are indented by 2x only? Also, is there any way I can adjust this behavior to my liking?

I don’t have a lot of experience with ocamlformat, but this behaviour is also followed by other tools, e.g. ocp-indent, and my understanding is that you want to minimize reflow of the rest of the source code if you or the tool decide to break a line. For example, if your code looks like:

f a (begin
    e
  end)

and you break the line just before the begin, you get

f a
  (begin
    e
  end)

and the body of the begin does not need to be reindented. (Things don’t actually work in exactly this way in the fun case because the treatment of fun involves a number of special heuristics, but this would be the general principle as I understand it.) Basically each argument is at “virtual” indent 2, and the fun adds another 2, so you get 4 for the body of the fun.
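
To make the principle concrete, here is a small hand-formatted sketch. It is not verified ocamlformat or ocp-indent output, and the names sum_broken and sum_joined are made up for illustration. In both versions the body of the fun sits 4 columns to the right of the start of the application, so breaking or joining the line does not reindent it:

```ocaml
(* Hand-formatted illustration only; not actual formatter output. *)

(* The fun argument is on its own line: the argument is indented by 2
   relative to List.fold_left, and the body by 2 more. *)
let sum_broken =
  List.fold_left
    (fun acc x ->
      acc + x)
    0 [ 1; 2; 3 ]

(* The fun argument is joined onto the first line: the body keeps the
   same column as above, so it did not have to move. *)
let sum_joined =
  List.fold_left (fun acc x ->
      acc + x)
    0 [ 1; 2; 3 ]

let () = Printf.printf "%d %d\n" sum_broken sum_joined
```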

Cheers,
Nicolas

Thanks for the intuition for how to get to 4; arguments having virtual indent 2, and fun adding another 2, makes sense to me.

While the algorithm makes sense, I feel pretty strongly that I would prefer 2x indent for this very common case of passing anonymous functions to other functions as the last argument. I am curious how others feel about this?

As a counter argument, note that 2x indent only makes sense when the passed function is the final argument – otherwise one can end up with awkward formatting such as this:

List.map ~f:(fun x ->
  some interesting operation on x)
  xs

or

List.map ~f:(fun x ->
  some interesting operation on x
)
  xs

which, as I’ll be the first to admit, looks off.

I agree that I would prefer indentation to always increment only by 2 (i.e. squash multiple indents coming from the same line).

For the particular case of anonymous functions as last arguments, however, I use the following style instead:

list_iter my_list @@ fun item_which_happens_to_be_a_list ->
list_iter item_which_happens_to_be_a_list @@ fun item ->
...

There is no indentation. If needed, I use parentheses:

(
  list_iter my_list @@ fun item ->
  ...
);
do_something_else ()

although I know some people prefer begin ... end in this case, which makes sense; and also, ocamlformat does not put those parentheses like I do. So I don’t use ocamlformat but ocp-indent.

This approach requires:

  • to define list_iter as Fun.flip List.iter;
  • or to write it as Fun.flip List.iter directly, which is annoying;
  • or to write it as my_list |> List.iter @@ fun item ->, which is also annoying, and type disambiguation does not work as well because the type of item is not known in the body;
  • the function argument to be unlabeled, which is why I don’t like suggestions to add ~f everywhere.

My secret hope is that everyone adopts this style and that we end up justifying variants like List.iter' = Fun.flip List.iter in the standard library, but that’s probably never going to happen :wink:
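
For anyone who wants to try the style, here is a minimal sketch. The helper sum_nested is a made-up example, and list_iter is written eta-expanded rather than as Fun.flip List.iter: as far as I can tell, the partial application would get a weakly polymorphic type and could not be used at two different element types, as the nested case requires.

```ocaml
(* Argument-flipped List.iter. Eta-expanded so that it stays fully
   polymorphic; [let list_iter = Fun.flip List.iter] would be weakly
   polymorphic because of the value restriction. *)
let list_iter xs f = List.iter f xs

(* Hypothetical example: sum all elements of a list of lists using the
   zero-indent style, wrapped in parentheses so that code can follow. *)
let sum_nested (xss : int list list) : int =
  let total = ref 0 in
  (
    list_iter xss @@ fun xs ->
    list_iter xs @@ fun x ->
    total := !total + x
  );
  !total

let () = Printf.printf "%d\n" (sum_nested [ [ 1; 2 ]; [ 3 ]; [] ])
```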

The 4x indentation is needed if the argument list must break before the fun:

let _ =
  Array.fold gear_nums_mat ~init:(larger argument_____)
    ~f:(fun init gear_nums_row ->
      Array.fold gear_nums_row ~init ~f:(fun init gear_nums ->
          match gear_nums with
          | [x; y] -> init + (x * y)
          | _ -> init
      )
  )

It would be very hard for OCamlformat to reliably indent this case while also using the 2x indentation when possible.

This is also a case where a small change in the code could cause OCamlformat to reformat a large part of it, if the 2x indentation suddenly does not apply.

… which is bad from a diff/blame point of view. :slightly_frowning_face: Makes sense.

I can see the appeal of the 0x-indent style when programming in a monad:

foo >>= fun x ->
bar >>= fun y ->
return (x + y)

These days I would use let* to get that formatting.
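
As a rough sketch of the two styles side by side, here is the same computation in the option monad. Both operators are defined locally from Option.bind, and the function names add_infix and add_letop are made up for illustration; Some plays the role of return.

```ocaml
(* Both operators are just Option.bind under different names. *)
let ( >>= ) = Option.bind
let ( let* ) = Option.bind

(* Infix style: each continuation starts on a new line with no indent. *)
let add_infix foo bar =
  foo >>= fun x ->
  bar >>= fun y ->
  Some (x + y)

(* Binding-operator style: same shape, same lack of indentation. *)
let add_letop foo bar =
  let* x = foo in
  let* y = bar in
  Some (x + y)

let () =
  match add_letop (Some 1) (Some 2) with
  | Some n -> Printf.printf "%d\n" n
  | None -> print_endline "none"
```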

I don’t personally like the 0x-indent style for higher order functions like map or fold etc, where the inner logic operates on a different type than your top-level pipeline — elements of the list rather than the whole list — and would be the indented body of a for loop in an imperative language.

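I thought a bit more about the concern that a small change in the code could cause a large part of it to be reformatted, which is bad from a diff/blame point of view. I think it is not so bad: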

  • I expect such reformatting would occur rather infrequently in practice.
  • Many diff/blame tools can ignore “whitespace-only changes” these days.

To make a concrete proposal, ocamlformat might provide the following new option:

--coalesce_indent_of_fun_literals_passed_as_final_arg
    Controls indentation of function literals, in the special case when a literal
    is the final argument of a function application whose first line break occurs
    just before the body of the function literal.
    If set, the body is single-indented relative to the line starting the function
    application and literal.
    If unset, the body is double-indented relative to the line starting the
    function application and literal: once for being part of the application and
    once for being part of the literal.

I hear you that this may be hard to implement, just thinking out loud.

One more follow-up question: I get this behavior even when I set

max-indent = 2

Is that expected behavior?

The max-indent option doesn’t work as expected because it fails to take into account indentation boxes that did not break. This is implemented in a modified version of the Format module from the stdlib, in vendor/ocamlformat_support/format_.ml, in case you want to have a look.

Wanting to indent the body of for-like loops certainly makes sense.

It seems @@ has special rules in ocp-indent. If I define let (@+) = (@@), ocp-indent gives me:

  list_iter [ 1; 2; 3 ] @+ fun x ->
    list_iter [ 4; 5; 6 ] @+ fun y ->
      Printf.printf "%d, %d\n%!" x y

which is nice. However ocamlformat gives me:

  list_iter [1; 2; 3] @+ fun x ->
  list_iter [4; 5; 6] @+ fun y -> Printf.printf "%d, %d\n%!" x y

which is definitely not what I want (but that may not be the default, I tested in a project which had custom ocamlformat rules).

Unfortunately, even ocp-indent fails to do what I want if I use parentheses:

  list_iter [ 1; 2; 3 ] @+ fun x -> (
      list_iter [ 4; 5; 6 ] @+ fun y ->
        Printf.printf "%d, %d\n%!" x y
  )

It uses 4 spaces to indent.

Looking back, I think that the main reason I started using @@ in the first place was because I didn’t like the 4 spaces to indent (whereas 0 spaces didn’t bother me even for for-like loops). I think that nowadays, with let* being available for monads, I would probably be happy with tooling that always uses 2 spaces for any infix symbol, including @@ and >>=, on the condition that it does stay at 2 if I use parentheses.

The argument “but if we don’t put 4 spaces, it means that removing the parentheses suddenly means reindenting everything!” does not seem to hold for me. In the above examples, adding the parentheses changed the indentation, which is exactly what bothered me!

Nothing after the fun x -> can be of a lower scope at that level of indentation, so the indentation is unnecessary. OCamlformat removes it in this case to support the long chains of monadic-style operators that are widely used.

I made this argument in the context of ocamlformat.

Nothing after the fun x -> can be of a lower scope at that level of indentation, so the indentation is unnecessary. OCamlformat removes it in this case to support the long chains of monadic-style operators that are widely used.

Sure. I agree it’s useful for monadic-style operators. But now that we have let*, infix operators for monads are less useful. I would also agree that it’s fine to not indent, but @smolkaj says that it makes it look more like a for loop, which leads me to believe that it is not “unnecessary”: it can be argued that this is an improvement in readability for at least some people.

I made this argument in the context of ocamlformat.

Sorry, I realize my answer was confusing because the example I gave was using ocp-indent. And also because your answer was about the case where there are too many arguments to fit on one line, not the case where we remove parentheses. I don’t use ocamlformat because it does way too many things differently from what I would do myself, so it’s probably best I don’t discuss its choices here :slight_smile:

As in, this is a known bug? Is it worth opening a GitHub issue for it?

I think this issue describes the problem: Reduced indentation / more sensible "max indentation" behavior. · Issue #1116 · ocaml-ppx/ocamlformat · GitHub (see the last two comments)
