Major OCaml pain points

I recently stumbled across Adam Chlipala’s comparison of OCaml and Standard ML. I’ve often found myself missing some features that Standard ML has, especially first-class constructors. I’m wondering what features (or lack thereof) of OCaml (mainly the language, but also tooling) do people find painful, but can’t be fixed without major backwards incompatibility?

If this has been beaten to death previously on this forum, please point me in the right direction.

I have a list that I keep meaning to write up. I’m not even sure they all have backward incompatibility problems:

  • allowing first-class record field access, such as x |> .field

  • removing the rec keyword, just allow recursion

  • allowing access to any function/name in the file, without "and" gymnastics (including co-recursive functions)

  • allowing access to any function/name within other modules without cycle problems

  • an inline match keyword (aka if-let): if-let SomeConstructor x = y then ...

  • inline comments: // don’t require (* for everything *)

  • use the Haskell/Elm syntax for constructors MyConst 1 2 3 4 instead of MyConst(1,2,3,4)

I think most of these would make OCaml much more user-friendly (especially the cycle stuff, as the and workarounds don’t really work when you start getting modules involved), without changing what it means to write in OCaml.
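For anyone who has not run into the "and" workaround mentioned above, here is a minimal sketch of what mutual recursion looks like today. The function names are made up for illustration only.

```ocaml
(* Mutually recursive functions have to be declared together in a
   single let rec ... and ... group. Illustrative names only. *)
let rec is_even n = if n = 0 then true else is_odd (n - 1)
and is_odd n = if n = 0 then false else is_even (n - 1)

let () = assert (is_even 10 && is_odd 7)
```

Across module boundaries the same pattern needs recursive modules (module rec), which is noticeably heavier.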

Those wouldn’t work well with shadowing, also I like being explicit about recursion personally.

I’m not aware of any particular issue with those

This was considered on https://github.com/ocaml/ocaml/pull/671 but no agreement on a specific syntax could be reached (according to my reading of the discussion).

A related PR https://github.com/ocaml/ocaml/pull/9005. Looks stalled.

This specific syntax would have issues because x . field and x. field and x .field are all parsed the same as x.field (i.e., spaces around the . are ignored). Something like (.field) (short mention in the aforementioned constructor PR https://github.com/ocaml/ocaml/pull/9005#issuecomment-537436288) is more likely.
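As a point of comparison, this is roughly what one writes today in place of a first-class field accessor. It is only a sketch; the point type and the values are invented for the example.

```ocaml
(* Hypothetical record type, used only for illustration. *)
type point = { x : int; y : int }

let points = [ { x = 1; y = 2 }; { x = 3; y = 4 } ]

(* There is no first-class accessor, so an explicit lambda is needed. *)
let xs = List.map (fun p -> p.x) points

(* The same workaround in pipeline style. *)
let first_y = List.hd points |> fun p -> p.y
```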


As said already, removing rec would break all existing OCaml code that relies on shadowing (in particular, 99% of my code). If the designers of OCaml could go back I think they’d rather have type rec as well (type nonrec is the retrocompatible fix to that initial omission).
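To illustrate the type nonrec point, here is a small sketch. Type definitions are recursive by default, so nonrec is what lets a new t refer to an earlier t; the module name Ints is hypothetical.

```ocaml
type t = int

module Ints = struct
  (* Without nonrec, the t on the right would refer to the new t
     itself and the abbreviation would be rejected as cyclic. *)
  type nonrec t = t list

  let sum (l : t) = List.fold_left ( + ) 0 l
end
```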


Can you be more specific? I’m aware that you can shadow names at a module/file scope.

As for being explicit about recursion, there are upsides, but it’s also very irritating, especially when combined with having to move things around your file to satisfy name resolution. Single-pass compilers haven’t been a thing for a very long time, and this is very annoying. I’ll note that Elm, which is a more recent ML where they have put significant effort into the user experience, supports calling functions anywhere in a file, though they still disallow cyclic modules.

A typical use of shadowing within an expression could look like this:

...
let x = 1 in
let x = f x in
let x = x + 2 in
…

This breaks with recursion by default.

OCaml does require you to write code in a topologically sorted order, which (I’m aware I’m biased by the habit) is quite nice to read.


And from what I hear contributes to fast compiles?

Well at least the fact that modules make a DAG makes it practical to have incremental compilation at the module level; each module is typechecked and compiled exactly once. I believe OCaml is even able to do link-time optimizations (with flambda switches).


[throughout, I say “Caml”, not “Ocaml” because this is an inheritance from Caml and the designers of Caml, and I want to pay due respect to them.]

A long, long, long time ago, on a continent far, far away, I asked Pierre Weis why Caml(-Heavy, and by extension, Caml-Light) didn’t allow this. He patiently explained that this … “weakness” was a direct consequence of the Caml design principle of ALWAYS having principal type schemes. To wit, if a Caml program (in the core language, sure, not objects, modules, extensible variants, etc) fails to type-check, then adding type-coercions WILL NOT improve the situation.

This is a really, really important point (in Caml’s favor). One important selling point of ML is typically that you don’t have to write down types – the compiler infers them for you. The “principal type schemes” property guarantees that the compiler will always come up with a type that is better than the type you would have written (i.e., that your type can be derived from the type the compiler infers) UNLESS your program is ill-typed. And all of this, without type-annotations/coercions. A language with this property really delivers on the promise of “type inference” (comma maaaan).

SML was not such a language. When you write the expression

x.field1

in the presence of multiple record-types (say, “t1”, “t2”) that all enjoy a “field1” member, the compiler cannot infer a type for the above expression, but it can infer a type for the expression

(x : t1).field1

and (of course)

(x : t2).field1

I for one found that (again, long long ago on a continent far far away and ever since) the tradeoff was well-worth it. It lets you write code faster, because you never, ever, ever worry that your type-check failure is due to inadequate type-casting – b/c if it were, then you have to figure out which of the umpteen gazillion places where you didn’t decorate something with a type … needs a type-coercion.

That latter bit is … well, all I can say is, that’s nuts, that is.
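For concreteness, here is how the two annotated expressions above look in present-day OCaml. This is only a sketch: as far as I know, recent compilers (4.01 and later) use the annotation to pick the record type, which is a later addition and not the Caml design described above. The names t1, t2 and field1 follow the example; the function names are made up.

```ocaml
type t1 = { field1 : int }
type t2 = { field1 : string }

(* With no annotation, the most recently defined record type wins. *)
let get_last x = x.field1

(* An annotation selects the earlier record type instead. *)
let get_first (x : t1) = x.field1
```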

P.S. I feel I must (again) note that this was all due to Pierre Weis’ patient and generous explanation. None of this (except for the experience report) is from me. Erm, well, except (of course) for any errors in my explanation.


Note that I’m in favor of keeping the let rec syntax, but your example is not pertinent. This code does not declare functions, only expressions, so there is no ambiguity in

…
let x = 1 in
let x = f x in
…

The compiler is smart enough to know that recursion cannot be applied here:

let rec x = x + 1;;
Error: This kind of expression is not allowed as right-hand side of `let rec'

  • Parallelism
  • First-class Windows support
  • Subtyping and function arguments polymorphism
  • Integer ranges

What do you mean by integer ranges ? There are many flavors of int in OCaml already (think of the Int32, Int64, and Z modules)

For pattern matching: https://github.com/ocaml/ocaml/issues/8504
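A rough sketch of the gap, assuming the request is about range patterns: characters can already be matched by range, while integers need a guard. The function names are invented for the example.

```ocaml
(* Character ranges are accepted in patterns. *)
let is_digit c = match c with '0' .. '9' -> true | _ -> false

(* Integers have no range pattern, so a guard is the usual stand-in. *)
let is_small n =
  match n with
  | n when 0 <= n && n <= 9 -> true
  | _ -> false
```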


A good use case for function shadowing is described in Real World Ocaml, in chapter “Data Serialization with S-Expressions”, at the end of section “Preserving Invariants” (you can see it at https://dev.realworldocaml.org/data-serialization.html).

The idea is that you often extend a module (with preprocessing features like [@@deriving sexp] on a type, or with an include statement). Often you then want to shadow one of the generated or included functions, for example to preserve an invariant of your type (which a [@@deriving sexp] converter could otherwise break). If calls were recursive by default, you could not use the original function to write your customized one, which is a pity because its default behavior can be reused in the new one. (So if we gave up rec, a lot of code would break, and that is not in the philosophy of OCaml development as far as I know.)
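A simplified sketch of that pattern, without any ppx: the module and function names here are hypothetical stand-ins, not the ones used in Real World OCaml. The shadowing definition reuses the included function precisely because let is not recursive by default.

```ocaml
(* Hypothetical base module providing a default conversion. *)
module Base_counter = struct
  type t = int
  let of_int n = n
  let to_int t = t
end

module Positive = struct
  include Base_counter

  (* Non-recursive let: the of_int called in the body is the one
     from Base_counter, so the default can be reused after checking
     the invariant. *)
  let of_int n =
    if n <= 0 then invalid_arg "Positive.of_int: expected n > 0";
    of_int n
end
```

With recursion by default, the inner of_int call would refer to the new definition and loop forever on valid input.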

Also, the rec keyword helps the compiler to make optimizations, which is a good thing for performance use cases.

I think the value we get out of the rec keyword is worth the pain of writing it :slight_smile:


A typical use of shadowing is

let (* maybe rec *) f x = ...

let (* no rec *) f x = assert stuff; f x

or with values

let x = bla in 
let x = Some x in (* with -rectypes using rec would typecheck *)
...

Thanks for your reply. I’m curious, is there a github request for “more subtyping and polymorphism” also ? As an OCaml user, I feel the need for more subtyping and polymorphism now and then, but at the same time I find myself unable to specify precisely the feature I want without breaking all the nice type inference and structure. It is not obvious how to arrange all the new polymorphic types we would like in one coherent whole so that any value still has a unique “principal” type.

Shadowing is ambiguous in general; for example, consider these two perfectly valid definitions:

let just_one =
  let x = [] in
  let x = 1 :: x in
  x

and

let many_one =
  let rec x = [] in
  let rec x = 1 :: x in
  x
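To make the difference between the two definitions above observable, here is a small check. The names finite and cyclic are mine, and the first binding of the recursive version is dropped since it plays no role.

```ocaml
(* Non-recursive: the inner x refers to the outer empty list. *)
let finite =
  let x = [] in
  let x = 1 :: x in
  x

(* Recursive: x refers to itself, giving a cyclic list of ones. *)
let cyclic =
  let rec x = 1 :: x in
  x

let () =
  assert (List.length finite = 1);
  (* The tail of the cyclic list is physically the list itself. *)
  assert (List.tl cyclic == cyclic)
```

Calling List.length on the cyclic value would never terminate, which is why the check uses physical equality instead.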

Personally, I am not fond of the idea of pervasive implicit recursion everywhere, because I like code to have beginnings and endings when I read it. It might make some code more awkward to write, but the readability gain seems worth it.


People are talking about shadowing here, note that shadowing would still be possible if the function recursion default was switched, you would just have to create a binding with let nonrec ... instead of let .... [EDIT: I mean, presumably let nonrec ... would also be added as valid syntax at that point :-)]

I doubt this would work well in OCaml. It doesn’t in Haskell. In SML this works great since records are structurally typed, so field is roughly equivalent to OCaml’s field : < field : 'a; .. > -> 'a
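The object type quoted above can be seen directly with OCaml objects, which are structurally typed. This is a sketch only; the objects are throwaway values made up for the example.

```ocaml
(* Inferred type: < field : 'a; .. > -> 'a *)
let field o = o#field

(* Works on any object that has a field method, whatever else it has. *)
let n = field (object method field = 1 end)
let s = field (object method field = "s" method other = 2 end)
```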

Namespaces. We have a flat namespace for compilation units that requires us to prefix each module with a unique name. Having Common Lisp style namespaces (aka packages) is what we need in modern Reason/OCaml, which is used with package managers such as opam or esy.
