What I dislike about OCaml

Concur. What I think is a bad idea: a tool that insists there should be exactly one automatically enforced style for every project across the entire language ecosystem.

I think it’s a good thing that different communities can settle on different style conventions for the projects within their ambit, and it would be nice to have tools capable of supporting disparate conventions as different communities choose them. This exists in the C++ language ecosystem. I wish it existed in the OCaml ecosystem too, and I can’t say it does today.

I concur.

I’m generally pro auto-formatters, but to be useful for the variety of human needs (programming is a human activity after all) they need to be configurable enough to support formatting conventions that a variety of different people find to support readability: this is an accessibility issue.

Recent developments in ocamlformat have made it too inflexible to help me with consistent readability, so I have also begun to drop use of it (after the maintainers made it clear they were not open to preserving features to help people who read as I do).

It’s obviously a non-trivial amount of work to develop and maintain a formatting tool, so while I would prefer a different outcome, I don’t begrudge them. It does open up space for an alternative formatting tool, for when and if someone has the time :slight_smile:

You are hopefully aware that ocamlformat supports multiple styles and there is no “one style” that it forces you into? Of course there are limitations to what it can support, but I often find ocamlformat-formatted code easier to read than manually formatted code, which often leaves me with the question: is it deliberate or accidental that it is formatted that way?

I don’t subscribe to the idea that formatting is a good way to encode information. It is way too subtle, and getting “creative” in formatting tends to be readable only for the author who knows the intent behind it. Everyone else is distracted from the code by wrapping their head around why it is formatted this way, which actually detracts from understanding what the code says.

Sure, bad formatting is an excellent way to make code unreadable, but to make code readable I think good names, short functions, some kind of reasonable logic and, if all else fails, comments are far better tools.

These are my main pain points concerning ocamlformat:

  • I need to reinstall ocamlformat to whatever version is used by whatever project I’m switching to. (Yes I could have more local opam switches but then I have to deal with a different form of development friction.)
  • I need to add ocamlformat to the test dependencies of my projects so that the CI has it available, but then it’s a very tight dependency version constraint. (Yes, I could make a different CI that special-cases the ocamlformat check, but then I increase the complexity of the CI and the dev setup to keep everything in sync.)

I agree with that wholeheartedly. And I find autoformatters (all of them) frustrating. I only want to use an autoformatter when a project reaches a certain scale (especially in number of contributors) where the frustration from having style conversations in reviews is significant enough to offset the cost of switching to an autoformatter.


All in all, I don’t have a gripe against the formatting that ocamlformat does. The packaging (the fact that newer versions are incompatible with older config files) makes it difficult to use. And I also have a mild dislike of autoformatters in general.

This is a painful part of the ocamlformat tool. However, this PR and the release of opam 2.2.0 should address this problem by adding a {with-dev-setup} variable. You will be able to declare the ocamlformat version you want for development purposes.

Here are a few examples, in ReasonML, pardon my french:

  let rect: (~top:    [< `length | `zero | `auto],
             ~right:  [< `length | `zero | `auto],
             ~bottom: [< `length | `zero | `auto],
             ~left:   [< `length | `zero | `auto]) => shape;

  let f: (string => unit) => t;
  let g: (int    => unit) => t;

Is this a bad way of encoding the information that there are major similarities in these types that can be abstracted away when trying to understand what’s different in them? Do you have a better way of conveying that information? Do you think these are too subtle and “creative”? Do you find it distracting and harder to understand the code when formatted this way, compared to what the automatic formatting tool suggests?

  let rect:
    (
      ~top: [< | `length | `zero | `auto],
      ~right: [< | `length | `zero | `auto],
      ~bottom: [< | `length | `zero | `auto],
      ~left: [< | `length | `zero | `auto]
    ) =>
    shape;

  let f: (string => unit) => t;
  let g: (int => unit) => t;

I don’t deny that coding conventions are generally good, and that giving people full creative freedom in formatting is bad. But I’m trying to point out that by enforcing automatic formatting you lose something significant. And even if there are other benefits gained that usually outweigh that loss, that’s not always the case. For example, hand-formatting interface files can make a public library with complex types much more accessible.

No, it is not, but in both examples it is pretty apparent to me that the polymorphic variant is repeated, and that refactoring would probably take less time than adding 4 spaces in the first line, 2 in the second, 1 in the third and 3 in the fourth. And I guess that in a hypothetical case where rect needs to be renamed or a fifth argument added, there is even more manual work sculpting it to shape. It feels very much like a French garden, with carefully cropped branches. The code does look nice, but I would dread touching it because I would need to spend so much manual time making it “pretty”, similar to how people enjoy aligning = (or let forms in Clojure).
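For readers weighing the refactoring suggested here, a minimal OCaml sketch of what it might look like. The alias name `dim` and the record standing in for `shape` are assumptions made for illustration; the thread never defines either type.

```ocaml
(* Hypothetical alias for the polymorphic variant repeated in the signature. *)
type dim = [ `length | `zero | `auto ]

(* Hypothetical stand-in for the thread's [shape] type. *)
type shape = { top : dim; right : dim; bottom : dim; left : dim }

(* Each argument accepts any subset of [dim], as in the original signature. *)
let rect :
    top:[< dim ] -> right:[< dim ] -> bottom:[< dim ] -> left:[< dim ] -> shape
    =
 fun ~top ~right ~bottom ~left ->
  {
    top = (top :> dim);
    right = (right :> dim);
    bottom = (bottom :> dim);
    left = (left :> dim);
  }
```

Whether naming the repetition beats hand alignment is exactly what the thread disputes; the sketch only shows that the repetition can be named.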

And honestly, in the f and g case I find the “conventional” way easier to read because it looks the way it is displayed everywhere when I ask for a signature. I am actually more confused by the spaces and wondering what information they are supposed to convey. That they both have two arguments? That both return unit? Do any of these observations matter, or is my mind finding patterns where there is just a coincidence? Maybe the functions are related somehow and if I were to move them I should group them together? If this is important information, I feel a comment would’ve actually been clearer.

@raphael-proust I totally agree with how annoying dealing with ocamlformat can be, especially if you run into the case where your version of ocamlformat does not support your compiler, and to use a newer version you’d need to reformat your project. Also, it brings a lot of dependencies into your switch. OCaml-CI solves the CI dependency issue somewhat by just looking for an .ocamlformat file and installing exactly the version of ocamlformat specified there.

Indentation (which is formatting) is a good way to encode AST-depth.
Line-breaks (which is formatting) is a good way to encode information about the shape of the AST.

An example where line-breaks can be useful for conveying information that ocamlformat breaks is when some expressions fall into significant groups but are flat in the AST.

Format.fprintf fmt "date: %a; time: %a; tags: %a; text: %s"
  Date.pp date
  Time.pp time
  Format.(pp_print_list ~pp_sep:pp_print_space pp_print_string)
    tags
  message

Process.exec "foo" [
  (* flags *)
  "--log-level"; "*:debug";
  "--base-dir"; "/tmp/foo/";
  "--recursive";
  (* other parameters *)
  "bar"; "baz"; "bax"
]

Essentially, sometimes, arguments are grouped in a way that makes sense to replicate in the formatting. (Now obviously on such short examples it’s not a big deal and it’d be fine the way ocamlformat lays it out; but these are examples.)

Some interesting reading on the topic: A visual perception account of programming languages: finding the natural science in the art


EDIT: Notice how the Process.exec example mimics the kind of line-break and indentation you’d find in a shell script or a makefile:

foo \
  --log-level '*:debug' \
  --base-dir /tmp/foo \
  --recursive \
  bar baz bax

The time it takes is irrelevant to the point I’m trying to convey. I’m trying to establish that there is value in manual formatting making code easier to read and understand. Only when that is established can we consider whether it’s valuable enough to be worth the effort it takes. And that will depend on the situation.

Also, I was hoping you’d have enough imagination to consider that the examples could be more complex, such as the polymorphic variants having more constructors that are not in common, for example. But alas…

Yes, and sometimes that is worth the effort. Other times it’s not. Not everything needs to be completely black and white.

It tries to convey that the function signatures are identical, except for a “slot” that can contain different concrete types.

A comment probably would be clearer. There is little that can beat the clarity of a precise textual explanation. But it’s not very efficient. You could decide to have all the functions in a file ordered alphabetically, for example, instead of grouped by functional similarity or some other property. And then have comments on each of them saying that “f, along with o, x and z are all jig-manipulators that behave exactly the same except for f taking a float while the others take their argument in other forms”. It would be entirely clear once you’ve gone through, read and understood all the comments, sure. But also incredibly inefficient.

In UI design, the principles of gestalt psychology are often used to convey that elements have some relation to each other. Not for clarity, but for efficiency. Through formatting those same principles can be applied to code.

There is good reason why the rigidly uniform widget toolkits devised by programmers are increasingly going out of style in favor of more expressive UI frameworks. And sure, more creative freedom does sometimes make for disastrously bad UIs. But also some that are much, much better.

I don’t think enforced style rules are what’s going to make bad programmers write good, understandable code. No more than I think rigid widget toolkits are going to make programmers create good UIs. There’s still going to be plenty of creative room for making crap.

(I feel this might have veered a bit off-topic though. Apologies to everyone!)

It sports more options now than ever before, but those multiple “styles” (profiles) are still what I would characterize as multiple minor variants on one basic theme. I can hope that it will continue to grow new options in forthcoming releases so that I might be able to adopt it without too much disruption to my existing style preferences.

It’s a bit of a shame that ocamlformat is going the other way. Implementation complexity and (IMHO) the somewhat unfortunate desire to mimic golang’s zero-configuration go fmt are steering the tool to be less customizable with each version. I too wish that ocamlformat was a little more customizable (within reason).

I agree. Gofmt envy will lead nowhere, for several reasons:

  • Go had it from the start, so there never were multiple styles.
  • Go has a much simpler syntax, for which it’s easy to pick a reasonable style.

OCaml has neither of these, and I think ocamlformat should keep giving people options to preserve their preferred style (or something close enough). I’d stop using it if it removed the options I use, I think.

We didn’t remove any options for a long time, and users are welcome to open issues (and even PRs) if they think a new option should be created, or an existing option could use a new variant.

Would you be willing to re-open this issue? De-deprecate align-* features · Issue #1996 · ocaml-ppx/ocamlformat · GitHub

I think this points to part of the disconnect in points of view here: programming languages are tools for humans to read and write about programs, and humans are more than information processors. The text of a program is first and foremost a medium for communicating between humans, and only secondarily a source for encoded information. In a language that has the virtue of not being white-space sensitive, formatting is more like an accent, or cadence of speech. If all we cared about was encoding information, we could just write all our OCaml programs on one line :smiley:

Consider that depending on a person’s upbringing or native language(s) some accents are easier for them to understand than another. But this isn’t because of an innate information content of the utterances.

Consider that many people with less-usual sensory processing makeups benefit from text that uses special fonts, or has limited line widths, or ragged edges. The fact that alignment isn’t helpful for you personally, or doesn’t encode additional data, doesn’t change the fact that many people find it a non-trivial aid to readability.

All that said, I really agree with you that what we all really want is structured editing that would allow people to view the source in whatever way is best for them :smiley: A central point of friction with auto-formatting is that all the regular readers and writers of the program have to find common ground.

That’s a confusing statement; previously-deprecated options were removed in 0.22, earlier this year. As a result, at least for a while, I’ve locked our ocamlformat version to 0.21; it’s TBD whether we end up upgrading eventually, or retreat to an alternative formatting mechanism.

(That deprecation and removal, and previous statements that ocamlformat’s focus would tend towards fewer options (and thus an easier-to-maintain implementation, an understandable meta-objective), are what cause this topic to resurface periodically.)

Aside from maintainability concerns, the regular mention of “diff-friendliness” (as in the comment @shonfeder linked here) as an overriding criterion for supporting any given formatting option is baffling to me. Lots of activities in programming create things that aren’t “diff friendly”, but that’s very rarely an important factor in deciding to do anything IME. I reach for formatting tools to enforce a degree of regularity in a codebase, and do so yielding something that is most readily scanned, comprehended, and edited. Worrying about whitespace in diffs feels quite irrelevant in comparison.

I reach for formatting tools to enforce a degree of regularity in a codebase, and do so yielding something that is most readily scanned, comprehended, and edited. Worrying about whitespace in diffs feels quite irrelevant in comparison.

The differences are not just in whitespace: closing brackets and delimiters get moved around. This matters when you want to backport code to an earlier release that was formatted in a different way, because the patches no longer apply.

Ah, fair enough. For teams where such things are a concern, I guess it then makes sense to offer such things as options. :person_shrugging: :wink:

Just checking in as another relative beginner in OCaml who wasn’t forced to use it!

The language is (almost) perfect. It doesn’t need early returns or better loops. OCaml was the first functional language I really dug into, and I love it! Modeling computations in terms of expressions is great. If anything, I would be happy to see more functional features, like maybe making various kinds of effects encoded into the type system (not like it all has to be in monads, just a little tip that you’re using a function which isn’t referentially transparent). Some days I want modular implicits, and other days I feel like functors are better from a maintenance perspective. Could be cool if there was an annotation or something to tell OCaml to do type specialization reified type variables on certain code paths for speeding up some things (floats), but I also understand this is kind of challenging to reconcile with the uniform object representation that makes OCaml’s garbage collector so awesome and compiler so fast.

I would also say that I find OCaml extraordinarily readable, and this almost (but not quite) makes up for the lack of good documentation for most third-party libraries. Because the documentation situation is so miserable, I’ve often resorted to reading code and been pleasantly surprised by how easy it is.

Started learning OCaml with the old horrible website and this wasn’t very good, but quickly switched to RWO. I just got set up with Dune, Merlin and Base, and I’ve been pretty happy with those things. Don’t get too many inscrutable compiler errors because Merlin shows them in my editor. If the types are too nested and opaque, I just write some aliases and annotate functions with them so I don’t have to see all the crazy nested stuff and get readable types in my errors.
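A small sketch of that aliasing habit, using only the standard library. Every name here (`user_id`, `error`, `outcome`, `directory`, `lookup`) is hypothetical and invented for illustration:

```ocaml
(* Hypothetical aliases that give short names to otherwise nested types. *)
type user_id = int
type error = [ `Not_found of user_id | `Invalid of string ]
type 'a outcome = ('a, error) result
type directory = (user_id * string) list

(* Annotating with the aliases keeps the signature short:
   directory -> user_id -> string outcome *)
let lookup (dir : directory) (id : user_id) : string outcome =
  match List.assoc_opt id dir with
  | Some name -> Ok name
  | None -> Error (`Not_found id)
```

Without the annotations, the inferred type would spell out the list, the tuple and the variant in full; with them, the signature can be read in terms of the aliases.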

Base is great for me. It goes more in the direction of labelled arguments and result types instead of exceptions, and this is exactly what I want. I don’t always use it for small projects, but it’s pretty awesome in a larger codebase.

I do wish the standard library was better. It’s bad that the only real way to manipulate strings is Str and everyone immediately tells you not to use it. Re is nice once you know about it.

Not totally satisfied with the pretty printing situation of records, but at least there is this sexp stuff with Jane Street stuff.

I’m in general very happy with the prevalence of s-expressions in Jane Street tooling. It’s such a blindingly obvious and simple way to format structured data, and for some reason the rest of the world is chasing after JSON and XML. JSON is fine. XML is whatever. S-expressions are just on another level of simplicity and elegance. Love that.

OCaml’s ecosystem isn’t perfect. Sometimes I don’t know wtf is going on with Dune and Opam, but then I just ask and someone helps me. Documentation is terrible way too much of the time. Can’t always get the library for the thing you want to do, etc.

Still, this beginner had a very different experience from the OP. I <3 OCaml.

Interesting. I find the lack of early return (in combination with the result monad) to be pretty troublesome. And for the record, I’m a big anti-monad person; nevertheless, I find the result monad to be pretty useful.
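To make that trade-off concrete, here is a minimal sketch of how the result monad can stand in for early return, assuming OCaml 4.08 or later (for binding operators and the `Result` module). The function names `parse_positive` and `area` are hypothetical:

```ocaml
(* Binding operator: on [Error], the rest of the computation is skipped. *)
let ( let* ) = Result.bind

(* Hypothetical validator: accepts only strictly positive integers. *)
let parse_positive (s : string) : (int, string) result =
  match int_of_string_opt s with
  | None -> Error ("not an integer: " ^ s)
  | Some n when n <= 0 -> Error ("not positive: " ^ s)
  | Some n -> Ok n

(* Each [let*] acts like an early return on failure. *)
let area (w : string) (h : string) : (int, string) result =
  let* w = parse_positive w in
  let* h = parse_positive h in
  Ok (w * h)
```

Here `area "x" "4"` stops at the first `let*` and never parses the second argument, which is the behaviour an early `return` would give in an imperative language.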