I’ve been forced to work with OCaml and have slowly come up with a growing list of things I dislike about OCaml. I don’t know how I feel about posting this, it feels like a harsh critique, and I’m sure people will disagree with many of my points, but perhaps it is good to share so that the community can see where things can be improved? Forgive me in advance but here it goes:
First Steps
opam and dune. The accepted way to manage and build your project and its dependencies is to use two different tools: opam and dune. Both of these tools are a nightmare to use. They sort of overlap and you will often wonder what is being used under the hood when you build something (opam, dune, or both?). They will have you create a number of files in a lisp-like language (without extension so that your IDE cannot format or syntax highlight them), have horrible documentation (I cry when I think of the hours I spent reading the dune documentation or the man pages for opam), and in general don’t really work as intended.
Which standard library. Right off the bat you’ll run into a number of standard library augmentations and alternatives to choose from. The ecosystem is extremely fragmented. Apparently the standard library itself is lacking, using old ways, or worse poorly documented. It seems like maybe 50% of developers use Jane Street’s set of libraries instead of OCaml’s official stdlib (which is also confusing, uses outdated OCaml, and is poorly documented as well).
Doing simple things. Want to decode a hexadecimal string? Encrypt something with AES-GCM? Generate a cryptographically-secure random number? Split a string? Manipulate a bytearray? Create a folder on any platform? Oh boy, everything is hard in OCaml and you’ll often have to write these things yourself or rely on some C libraries with bindings.
Compiler errors are useless. This is one of the things that’s often cited as what made people move to Rust. OCaml, learn from that! Fix your damn compiler errors. They are meaningless at the moment and people end up giving up the language as soon as they get stuck on them.
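To make two of these concrete: splitting on a character is a single stdlib call (String.split_on_char), while hex decoding is the kind of thing that tends to get written by hand. A minimal sketch, where hex_digit and decode_hex are hypothetical helper names and not stdlib functions:

```ocaml
(* Splitting on a single character is in the standard library. *)
let fields = String.split_on_char ',' "a,b,c"

(* Hypothetical helper (not a stdlib function): value of one hex digit. *)
let hex_digit c =
  match c with
  | '0' .. '9' -> Char.code c - Char.code '0'
  | 'a' .. 'f' -> Char.code c - Char.code 'a' + 10
  | 'A' .. 'F' -> Char.code c - Char.code 'A' + 10
  | _ -> invalid_arg "hex_digit: not a hexadecimal character"

(* Hypothetical helper: decode a hex string such as "48656c6c6f"
   into the raw bytes it encodes, returned as an OCaml string. *)
let decode_hex s =
  let n = String.length s in
  if n mod 2 <> 0 then invalid_arg "decode_hex: odd length";
  String.init (n / 2) (fun i ->
      Char.chr ((hex_digit s.[2 * i] lsl 4) lor hex_digit s.[(2 * i) + 1]))
```

Invalid input raises Invalid_argument here; returning an option or a result would be an equally reasonable design.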
Lack of documentation. The official way to generate HTML documentation is to use ocamldoc. Let me tell you, it looks ugly, and is hard to navigate. Worse, it’s often really hard to understand where the documentation for your dependencies lives. That is, if it lives anywhere, or if there’s any documentation to begin with. Most projects in OCaml have 0 examples, and close to no documentation. Let me give you a challenge. Try to use a hashtable with the Jane Street libraries, or try to use channels with the stdlib.
Lack of comments. This is somewhat related to the previous point, but here I’m talking about reading code. Commenting your code is not a thing in OCaml. You’re most often supposed to read interface files (.mli) and guess what the functions do by looking at the types they take and the types they return. As the language is very expressive, it can be hard to follow some code as types are generic by default.
Lack of resources. There’s honestly not that many resources to learn OCaml. A lot of them are university courses that talk too much about the theory. There’s actually a single good resource and it’s Real World OCaml. If you’re looking for resources for advanced concepts like GADTs, you’ll basically run into articles that are written for Haskell or another language. At this point it feels like you need to learn Haskell to understand OCaml.
opam & dune. There’s really a lot to say here. First, you constantly need to manually update opam. Sometimes, opam will fail for no reason. If it fails in CI, good luck trying to solve this. If you are trying to do something specific, the documentation of opam and dune is atrocious. It feels like every week people are having issues building the project. Starting from scratch is a nightmare, as there’s no good way to list your dependencies and their versions. If you want to install a new dependency, it might brick your whole setup as it might change the versions of already installed dependencies. If you want to use a tool that requires the latest version of OCaml but your local switch doesn’t have the latest version of OCaml, you’re also screwed. If you want to publish something, it’ll have to get manually approved (each release) and it’s a painful process (even if some unstable tools exist like dune-release and opam-publish). The last time I tried to publish something there, it wasn’t working, until I asked around and people realized that the opam repository had been stuck for 10 days (and nobody else had noticed until then). Which meant I couldn’t even publish my package until someone fixed it manually.
esy. Esy is an alternative package manager that provides lock files, gets rid of switches, allows you to explicitly list your version of OCaml, the version of all your dependencies, as well as pins (patched dependencies). Having said that, I always run into millions of bugs and haven’t been able to use it on a complex project.
module names are not enforced by convention. They are enforced by two fields: name and public_name in a dune file. You can import these dependencies using both names (unless you’re importing from outside the project, in which case you can only use the public_name field). This is dumb, as you can have two completely different names and use them interchangeably. Even dumber, these names are not enforced by your directory name (in Rust something in a/b can be imported by doing a::b), so you really have no idea about what you’re importing, or where it is located, unless you search for a dune file with that name or public_name. Furthermore, the public_name can be lib.sublib, which will be installed by opam as lib and available to dune as lib.sublib. This is great but really it’d have been better if this was enforced by a directory structure, as this lib.sublib can now live outside of the lib…
Modules can be used as namespaces, but also as arguments to functors (basically functions that can create a module from a list of modules). It becomes a shit show when you read code because you never know if you’re reading one or the other. A module might be created just to pass arguments to a functor… it’s so confusing.
Types and variables are both using snake_case, so it’s impossible to tell them apart when reading code.
Since OCaml is a scripting language, you can’t refer to something that’s written after what you’re writing, so recursive types are a HUGE pain to write (types that reference one another). Worse, if you try to split your code in modules, two modules can’t mutually reference each other, it’s going to be a dependency cycle. So basically, even though you can create several files in a project in OCaml, you have to treat all these modules like third party libraries. It’s a huge headache.
You have to think recursion all the time, which is very unnatural. Even with functions that are not marked as rec. List.fold for example is about recursion.
Writing loops is so convoluted, unreadable compared to imperative code. They could have used some syntactic sugar perhaps…
Modules are badly designed as well, if you import a module it’ll become available to every module you import, so you can’t import two different dependencies that import two different modules with the same name, or that import the same module with different versions
Typing is not great… everything feels generic and so it is hard to follow code when you’re always wondering what type something is
Opens are abused (open X). These pollute the scope and potentially shadow other modules and functions, and you also have no idea what you’re getting. In Rust the equivalent is use X::* which is frowned upon for the same reasons, and use X::x is preferred, which imports only what you need and is explicit. OCaml doesn’t have that so people just open stuff all the time.
You can’t return early in a function. So if you have a loop that checks every element of a list or an array or a string or bytes and something’s wrong you can’t return early (for example). So basically, everything has to be done recursively if you want to write idiomatic OCaml (and if you want to avoid fighting with OCaml). Everything you write ends up being weird, non-natural, contrived, etc. It’s fine when you’re writing code, but I pity the people who will have to read it… there’s just no way to make it easier to understand. It’s like someone thought it would be a good idea to make programming more like math, and make it as hard to read code as to read a math paper. You’re just reading equations.
ppx (the macro language) breaks every time there is an OCaml update because it relies on internal compiler assumptions
It’s a pain to handle option and result types in OCaml, let syntax is ugly and painful, rust handles these much more easily with if let and ?
Order of what you write matters, so if you create a function called a, and you write another function called a after, it’ll shadow the first one from that point on. It’s dumb. This is because OCaml evolved from a scripting language.
A friend once said “Ha! We used to, in fact GADTs is an awesome way to end up with code nobody understands. We still use OCaml for the application of logical cryptography rules, but we try to stick to simpler stuff unless it’s absolutely necessary !”
Interface files suck, you always have two files for modules and have to maintain both of them, and comments are supposed to only be in one of them so you’re always switching back and forth if you’re trying to understand a library
Of course the whole “we don’t have parenthesis for functions” makes functions super hard to distinguish from the rest, and it also doesn’t look good when you’re forced to add a unit argument, or:
there is no notion of method, so everything is a module with a type called t and a bunch of functions that take t as argument. Who thought this was readable?
in general formatting and indentation gets really really bad. There’s so many examples of unreadable code due to formatting, I’m guessing every project must try to customize ocamlformat instead of using the defaults. This must be the most infuriating issue with OCaml I have.
comments are opening and closing (* … *), it’s such a hassle to write a comment since you have to close them as well… Every language has comments like // but OCaml doesn’t.
ppx syntax: don’t get me started
reasonML pretty much divided the community in two…
Sorry for spreading these across multiple posts, looks like I initially had some errors with posting more than two links as a new account. And then kept going. Anyway…
The bottom line is that OCaml is really not friendly to beginners. Its syntax is something I can accept, but at least it should have good compiler errors, friendly tooling to build and manage dependencies (like cargo or go), strong default formatting, a good standard library, a good culture around comments, and strong resources to learn the language. I think OCaml has none of that at the moment. It feels more like an academic language for people who like language theory, than a language that can be used to engineer real-world applications.
Well, that’s certainly a long list of criticism. I think I agree with a lot of them, even though I like OCaml and program in it daily. I’ve personally tried to address or raise some of these points (early return, better loops, hex functions, opam’s way of working, etc.).
I think there’s also some good in there that it’s good to remember: the type checker and pattern matching systems are incredibly good and robust, the language is simple and predictable once you get past the syntax, the libraries are generally good, and dune is quite powerful and fast. Documentation has made huge progress with odoc, and I think some of us like to write comments or even documentation in their .mli files. But clearly if you begin in OCaml after using rust or the likes, it’s clearly a rough time.
I have little hope that the stdlib will grow to contain the things you missed in it, but some other points might be more actionable. I hope.
We now proceed to ML as it existed in the interactive LCF theorem proving system, where it served as the metalanguage (i.e., proof scripting language) and command language…A major benefit of this typing algorithm was that it did not require the programmer to explicitly specify types…This feature preserves some of the simplicity and fluidity of dynamically typed languages like Lisp and POP-2, and was particularly important for LCF/ML’s use as a scripting and command language.
It’s very hard to address this kind of post. It should probably be titled “Difficulties of coming to OCaml from Rust”. Rust gave up on functional syntax (like OCaml, F# and Haskell) and switched to C++/JS syntax deliberately to try and attract systems programmers, leveraging knowledge of prior syntax.
Some of the things you mentioned seem like valid criticisms, but others just seem like things that are different in OCaml – not necessarily worse. One thing I completely agree with is that documentation is far from where it needs to be for beginners. This is especially bad given the fact that OCaml has evolved to have many different things than what is considered the norm, such as dune’s lisp syntax (which I still think is a needless shot to the foot). The fact that Real World OCaml uses alternative Jane Street libraries is IMO a huge issue as well, which is why I try to direct people to the Cornell book instead.
your essay reminds me a lot of the sentiment I had when leaving from Enterprise Java and starting Objective C 15 years ago. The APIs felt stupid, topics mixed, undocumented (lacking explanation what things did), a SDK without test or doc tools, malloc/free - the list goes on.
Then I heard a Mac greybeard talk about embracing unavoidable idioms and coding ‘along the api, not against’. That clicked. I swallowed my prejudices, accepted things and came to like it quite a bit.
When in Rome, do as the romans do. It won’t be all greek philosophy, but you may get along.
Just two points of your complaint: opam and stdlib.
opam
IMO managing dependencies and building software are different tasks. So to me it’s natural that there are two distinct tools. And the dependency is one-way: the package manager needs to build, but not vice versa. However, dune once used to tell about unmet dependencies in an actionable manner. Sadly that got removed. So room for improvement, well, sure, but unusable? No.
stdlib
OCaml is rather old and so there are many en-vogue topics younger than the stdlib. Those start separate. Look at c++ and boost. The old age (and careful backwards compatibility) again pleases the long-time user and annoys the rookie or student. Learning and taking up habits might not pay until you leave again.
So the aspects that drove you crazy stem from the very same reasons why many in the community love OCaml. And functional (recursion all over the place, immutability) indeed is different from imperative (iteration, mutable state).
There’s a saying that functional makes hard things easy and simple things near impossible.
allow me a personal side note: your username is an insult to anybody who wants to take you seriously. That’s not how adults interact and harms your cause.
Hi!
First, I have to say that I agree with some of your objections against OCaml, like the lack of packages, documentation and tutorials. Things are getting better with time but the language still has a long path before reaching a state more accessible for beginners. Also agree with some choices in the design of the language (but it is a personal taste). However, it feels like some of the things you are addressing are not against OCaml but functional programming in general.
For the language part:
Modules can be used as namespaces, but also as arguments to functors (basically functions that can create a module from a list of modules). It becomes a shit show when you read code because you never know if you’re reading one or the other. A module might be created just to pass arguments to a functor… it’s so confusing.
Types and variables are both using snake_case, so it’s impossible to tell them apart when reading code.
IMHO, that’s a strength of OCaml to have a minimal syntax but it’s like broccoli, it’s a question of taste as it’s different for every person.
Since OCaml is a scripting language, you can’t refer to something that’s written after what you’re writing, so recursive types are a HUGE pain to write (types that reference one another). Worse, if you try to split your code in modules, two modules can’t mutually reference each other, it’s going to be a dependency cycle. So basically, even though you can create several files in a project in OCaml, you have to treat all these modules like third party libraries. It’s a huge headache.
You have recursive modules even if it is limited (cf manual) alongside the and. Some find this clearer as you have to “bind” the declaration together, some not.
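As a small sketch of the "and" mentioned here (the types and function names are made up for illustration, and recursive modules are not shown), two types that refer to each other are declared in one group, and so are the functions over them:

```ocaml
(* Two types that refer to each other are declared together with [and]. *)
type expr =
  | Int of int
  | Block of stmt list * expr (* run the statements, then yield a value *)

and stmt =
  | Print of expr
  | Ignore of expr

(* Functions that call each other are tied together the same way. *)
let rec expr_size = function
  | Int _ -> 1
  | Block (stmts, e) ->
    List.fold_left (fun acc s -> acc + stmt_size s) (1 + expr_size e) stmts

and stmt_size = function
  | Print e | Ignore e -> 1 + expr_size e
```

Without the "and", stmt would be unbound inside expr, which is the order-of-definition constraint the original post is complaining about.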
You have to think recursion all the time, which is very unnatural. Even with functions that are not marked as rec. List.fold for example is about recursion
Writing loops is so convoluted, unreadable compared to imperative code. They could have used some syntactic sugar perhaps…
Welcome to the functional programming world
Opens are abused (open X). These pollute the scope and potentially shadow other modules and functions, and you also have no idea what you’re getting. In Rust the equivalent is use X::* which is frowned upon for the same reasons, and use X::x is preferred, which imports only what you need and is explicit. OCaml doesn’t have that so people just open stuff all the time.
Agree here, even if you can limit the scope of the open by using modules as namespaces. Another option is to alias the module with a letter (module M = My_Module).
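One reading of the two options above, as a sketch using only the stdlib List module (the value names are illustrative): an alias keeps every use qualified, and a local open restricts the opened names to a single expression or definition.

```ocaml
(* A module alias gives a short, explicit prefix without opening anything. *)
module L = List

let doubled = L.map (fun x -> 2 * x) [1; 2; 3]

(* A local open limits the opened names to one expression ... *)
let total = List.(fold_left ( + ) 0 (map succ [1; 2; 3]))

(* ... or to the body of one definition. *)
let longest_name names =
  let open List in
  fold_left
    (fun best s -> if String.length s > String.length best then s else best)
    "" names
```

Neither form is as selective as Rust's use X::x, but both keep the shadowing contained to a visible region of code.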
You can’t return early in a function. So if you have a loop that checks every element of a list or an array or a string or bytes and something’s wrong you can’t return early (for example). So basically, everything has to be done recursively if you want to write idiomatic OCaml (and if you want to avoid fighting with OCaml). Everything you write end up being weird, non-natural, contrived, etc. It’s fine when you’re writing code, but I pity the people who will have to read it… there’s just no way to make it easier to understand. It’s like someone thought it would be a good idea to make programming more like math, and make it as hard to read code as to read a math paper. You’re just reading equations.
Programming and maths are close together. You can still return early in a function by raising an exception you then catch locally. And yeah, if you are not used to it, it is hard to read, sure. But like in every language, there are drawbacks. Don’t tell me it was easy to read Rust lifetimes the first time you read them (as below)?
impl<'a> Foo<'a> for Bar {
    type Item = &'a PathBuf;
    type Iter = std::slice::Iter<'a, PathBuf>;
    fn get(&'a self) -> Self::Iter {
        self.v.iter()
    }
}
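On the OCaml side, the early return by local exception mentioned in this reply could look like the following sketch (first_negative and Found are illustrative names; "let exception" needs OCaml 4.04 or later):

```ocaml
(* Return the index of the first negative element, or None.
   The iteration is left early by raising a locally defined exception. *)
let first_negative (a : int array) : int option =
  let exception Found of int in
  try
    Array.iteri (fun i x -> if x < 0 then raise (Found i)) a;
    None
  with Found i -> Some i
```

Because the exception is declared inside the function, it cannot escape or be caught by accident elsewhere.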
It’s a pain to handle option and result types in OCaml, let syntax is ugly and painful, rust handles these much more easily with if let and ?
Still a question of taste: in Rust, the function expresses the result next to the call whereas in ocaml it’s expressed near the variable!
let ratio = div(x, y)?;
let ln = ln(ratio)?;
sqrt(ln)
against
let* ratio = div x y in
let* ln = ln ratio in
sqrt ln
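For the OCaml version to compile, let* has to be defined somewhere. A minimal sketch assuming the result type and OCaml 4.08 or later; div, ln and sqrt below are hypothetical helpers written to match the names in the snippet, not library functions:

```ocaml
(* Binding operator for [result]: stop at the first [Error]. *)
let ( let* ) = Result.bind

(* Hypothetical helpers mirroring the names used in the snippet above. *)
let div x y = if y = 0. then Error "division by zero" else Ok (x /. y)
let ln x = if x <= 0. then Error "ln of non-positive number" else Ok (log x)
let sqrt x = if x < 0. then Error "sqrt of negative number" else Ok (Stdlib.sqrt x)

let compute x y =
  let* ratio = div x y in
  let* ln = ln ratio in
  sqrt ln
```

Each let* line plays the role of a Rust line ending in ?: on Error the rest of the function is skipped and the error is returned.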
Interface files suck, you always have two files for modules and have to maintain both of them, and comments are supposed to only be in one of them so you’re always switching back and forth if you’re trying to understand a library
As before, it’s a question of taste, either you want to mix documentation with the code or you want to avoid mixing the logic with the interface description (as C does with h files). It’s a design choice, and I understand it can’t fit everyone’s needs but that’s how it is. If you want to write everything in one file, nothing forbids you to write:
module X : sig
  (* val y : .... *)
end = struct
  (* let y = ... *)
end
or
let inc : int -> int = fun x -> x + 1
(* or *)
let inc (x : int) : int = x + 1
Of course, OCaml is hard to start as a beginner as it lacks resources, but things are getting better and I feel that people want the language and the community to be more accessible. However, the process takes time and some need to struggle for the greater good (by writing documentation, tutorials, packages, answering on this forum, etc).
On opam / dune interaction and reliability, we have indeed fallen behind here (to put it mildly) compared with other language ecosystems. It’s not something that’s quick to fix, but both tools are looking at plans to sort out the integration. Historically, I think it’s fair to say that we ended up here because opam is older and Dune (Jbuilder initially) started from a mono-repo.
ppx does indeed change with each compiler release, but the release processes we have in place have meant for the last 4 or releases the package ecosystem has been ready by at least the release candidate (and often earlier). There is a roadmap for work to ease the migration of the ppxes themselves between releases, but for users of the syntaxes (as opposed to authors of the ppxes themselves) I think the story has become quite reasonable.
The incident with opam-repository’s web deployment earlier in the year was rather unfortunate, but that deployment has been completely overhauled (see the ocaml.org deployer) so at least that shouldn’t happen again!
A lot of people here have given valid points but I think something was missing.
I think OP does not realize that a lot of what he would like to have would be loved by the OCaml community, it’s just that they are very hard things to do, and OCaml does not have the same resources as, for instance, Rust.
I think the error messages are probably a good example of that. Everyone would agree that it would be better to have nicer errors in the compiler, it’s just a hard thing to do, and not a priority currently (as far as I know), because of the limited resources.
Thanks a lot for this post @throwaway! I think your points represent pretty much every aspect that beginners to the language struggle with when they get started, and it is great to have them all summarized in a single post.
I mostly agree with many of your criticisms (in the sense that I can see why each point can be frustrating to a beginner), but some random comments follow anyway.
Cheers!
Nicolas
Agree on the fragmentation part, but am not sure I follow your criticisms of being “confusing” and “poorly documented”, can you elaborate?
This is a common complaint about languages that feature pervasive type inference. Having said that, I can attest that if you spend some time learning about how type inference works, the errors end up making sense.
Types and variables can never appear in the same place, so this is not a problem in practice.
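A small sketch of that point (the point type and shift function are made up for illustration): the same snake_case name can be a type and a value at once, and the position in the code tells which one is meant.

```ocaml
(* [point] names a type here ... *)
type point = { x : int; y : int }

(* ... and a value here. The two never clash because types only appear
   in type positions (after a colon, or inside a type definition). *)
let point : point = { x = 1; y = 2 }

(* Parameter named [point], annotated with the type [point]. *)
let shift (point : point) : point = { point with x = point.x + 1 }
```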
What makes you think that? In fact modern OCaml has been developed with a “compiler-first” design, see https://caml.inria.fr/pub/papers/xleroy-zinc.pdf, page 20 (“Toplevels considered harmful”) for some of the design rationale.
This is typically considered an advantage as it makes it much simpler to reason about large codebases. It can sometimes be inconvenient, but the upside outweighs the downside almost always.
This is a feature of all functional programming languages. It just takes some getting used to.
OCaml has for and while loops that are excellent for writing imperative code; have you tried them?
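For readers who have not seen them, a minimal sketch of both loop forms (sum and index_of are illustrative names, not stdlib functions):

```ocaml
(* Sum of an array with a [for] loop and a mutable reference. *)
let sum (a : int array) : int =
  let total = ref 0 in
  for i = 0 to Array.length a - 1 do
    total := !total + a.(i)
  done;
  !total

(* Index of the first occurrence of [c] in [s], with a [while] loop;
   the loop condition doubles as the early exit. *)
let index_of (c : char) (s : string) : int option =
  let i = ref 0 in
  while !i < String.length s && s.[!i] <> c do
    incr i
  done;
  if !i < String.length s then Some !i else None
```

There is no break or continue, so an early exit goes either into the while condition, as here, or through an exception.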
The fact that module names occupy a global namespace is a long-standing limitation, but can be worked around by using “module aliases” (as used by Dune’s wrapped libraries).
PPX is not a macro language, it is an AST->AST preprocessor, so there isn’t a lot that can be done about this point. Having said that, a lot of work is being done to make it easier to deal with compiler upgrades, but I guess that it still has a way to go.
What behaviour would you like instead?
Seems like a reasonable opinion, but am not sure I understand what the criticism is here.
This is an often-heard criticism, but the consensus among heavy users of the language, and in industrial contexts, is that separate interface files are a very good thing, as it forces you to think about the API exposed by each module. And comments are supposed to be in the interface file anyway, so you shouldn’t need to switch back and forth so much.
Among other things, this is because there is a difference between a function taking n arguments and a function taking an n-tuple (which uses parentheses). In any case, the OCaml notation (which is also used in all other ML-type languages) becomes very natural as you spend more time with the language.
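The difference can be seen in a few lines (all names below are illustrative):

```ocaml
(* Curried: two arguments, applied by juxtaposition. *)
let add x y = x + y

(* Tupled: one argument that happens to be a pair, hence the parentheses. *)
let add_pair (x, y) = x + y

(* Only the curried form supports partial application. *)
let add_ten = add 10

(* A unit argument delays evaluation until the call site. *)
let answer () = add_ten 32
```

The types make the distinction explicit: add is int -> int -> int, while add_pair is int * int -> int.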
Actually there are only a few different style conventions that I have seen used. You can get a very pleasant experience by using ocp-indent (this is what I use as it is very lightweight) or ocamlformat if you want something with more bells and whistles.
On the contrary, I think OCaml is most successful today in real-world settings where large, industrial programs are developed. These users value OCaml’s advantages above all else and the occasional unfriendliness of the tooling ecosystem does not matter much.