What are the biggest reasons newcomers give up on OCaml?

mimoo · December 14, 2022, 2:11am

That’s probably not a good question to ask here, as most people here are surely people who persisted and made it (or are on the way there). But I’m still wondering what prevents OCaml from being more widely adopted as a language, and more, what made people who could have benefited from (and contributed to) OCaml give up.

My guess would be these three, in order of importance:

1. The compiler errors. Personally I think this is the biggest hurdle to get over when learning OCaml. I wanted to give up many times because of it and had to spend way too much time learning how to parse them. Even today, I tend to copy/paste them and add spacing here and there so that my eyes can get through them, and I still find them unhelpful for a number of errors.
2. The number of resources in Haskell, and the lack of good documentation. Most of the time when I google concepts or libraries that I run into in OCaml, I don’t find much. And if I find something, it’ll probably be something about Haskell. There must be a number of people who take this as an opportunity to switch languages and learn Haskell instead.
3. The tooling. Compared to cargo/go, the combination of opam and dune is extremely hard to use, doesn’t have good documentation, and feels cumbersome (probably because their responsibilities overlap). It very much feels like I’m dealing with Makefiles most of the time. I think the lack of convention in OCaml land probably stops a number of beginners.

What do you think are the top reasons?

jfeser · December 14, 2022, 3:23pm

I remember debugging being a serious pain when I first started. Specifically, the lack of a polymorphic print function is a big issue. Many (most?) new ocaml users come from languages where adding print statements is the easiest way to debug a program. If I could have one new language feature to make life easier for beginners, that would be it.
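To make the pain point concrete, here is a minimal sketch of what print-debugging a value involves with only the standard library. The `shape` type and the helper names are made up for illustration; the point is only that every type needs its own hand-written printer.

```ocaml
(* Hypothetical type used only for this illustration. *)
type shape =
  | Circle of int
  | Rect of { w : int; h : int }

(* Without a polymorphic print, each type needs its own converter. *)
let shape_to_string = function
  | Circle r -> Printf.sprintf "Circle %d" r
  | Rect { w; h } -> Printf.sprintf "Rect { w = %d; h = %d }" w h

(* Containers need a converter too, parameterised by the element converter. *)
let list_to_string elt_to_string xs =
  "[" ^ String.concat "; " (List.map elt_to_string xs) ^ "]"

let () =
  let shapes = [ Circle 1; Rect { w = 2; h = 3 } ] in
  print_endline (list_to_string shape_to_string shapes)
```

In a language with a generic print, the last three lines would be the whole program.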

Otherwise, I think the tooling has made huge strides since I started with the language (~2013). Dune is generally very good. I love being able to just clone dependencies into my repo and start hacking on them. Opam is just fine for new users. We have a great lsp server and really polished vscode support. Debugger support is still a pain point, but we have a real wealth of tracing tools.

Chet_Murthy · December 14, 2022, 4:13pm

Two thoughts:

Rust doesn’t have one either
BUT Rust has traits/modular-implicits, which make … “composing” such a print-function pretty easy.

pat · December 14, 2022, 5:14pm

I think it depends on the type of “newcomer”.

If you mean new to programming in general, an ML is probably not the easiest language to pick up compared to something like Python. There are many concepts that are harder to grasp than their Python counterparts. The terminology can also be off-putting (applicatives, monads, functors, etc.).

Another “newcomer” might be someone who has done lots of web programming, as it’s an easy start for many developers. Here, almost every app is built on the fact that HTTP is involved, meaning you want to have a server and, most likely, a database. Here I see OCaml lacking a bit. There is no “go-to” HTTP server library, and the documentation is not as good as what other popular web languages have. The database side is also tricky: there’s no go-to library for this either.

On the OCaml-on-the-web front, I sometimes wonder if you could build a new library on top of 5.0 effects and completely remove the need for monads. Time will tell. The bottom line is that both the server and the database library need to be async, and OCaml having two competing libraries for this does not help the newcomer.

jfeser · December 14, 2022, 9:24pm

That’s true, but rust has the print! macro. I don’t think print is passed as an argument often, so a macro is just fine.

You can sort of get this in ocaml with ppx, but it requires type annotations, and you have to install and set up the ppx. That isn’t terrible, and using ppxes has gotten much easier over time, but it’s a UX hurdle that I wish we didn’t have.

Particularly when working in a classroom setting, it lowers friction a lot if you only have to install the compiler.

lepoetemaudit · December 15, 2022, 10:26am

Yes! I’ve been playing with eio-cohttp and it’s quite wonderful. So much so I started making an experimental ground-up library for Postgres using eio and it’s been a very pleasant experience so far. I would love the community to rally around such efforts and bless effects-based libraries when the time comes.

Eio even gives proper stack traces from exceptions, something that seems to be swallowed up all the time by Lwt despite trying to be disciplined in how I use it, which is a major help in debugging async code.

Khady · December 15, 2022, 10:34am

It’s actually the opposite. cargo/go do multiple things at the same time, while opam and dune cover different things:

opam does package management (similar to apt/yum/npm/pip)
dune is a build system (like CMake or meson in C/C++ and webpack in the JS world)

So the integration between the two is not really tight, and not as smooth as what cargo can do.

mimoo · December 15, 2022, 10:50am

By that I meant that I need to declare dependencies in both the opam and dune files. I think it gets complicated if it’s transitive dependencies or dev deps, and the naming is different (dune has mod.submod, opam will use the public name, and OCaml will use Mod_submod).

dbuenzli · December 15, 2022, 8:53pm

Indeed, what you need to understand is that there are four kinds of names:

opam package names (e.g. ocaml)
ocamlfind package names (what dune calls library names, e.g. compiler-libs.common)
OCaml library archive names (e.g. ocamlcommon.cm[x]a)
OCaml module names (more precisely compilation unit names, derived from source files)

None of these names need to coincide or be included in each other, and there is no one-to-one mapping. An opam package name represents a set of ocamlfind packages. An ocamlfind package represents library archives to link (though nowadays these are mostly one-to-one with library archives), and an archive name represents a set of modules: those that are contained therein.

Yes, it’s an embarrassing mess.

I once spent a significant amount of energy trying to clear it up with a complete compatibility story here. But there is no interest. Upstream doesn’t seem to care [1], and dune people seem happy to live in their own bubble.

[1] Which is rather unfortunate. Leaving the dependency and linking model up for interpretation is not such a good idea. Equipped with these in the compiler, there are quite a few places where the user experience can be improved.

BikalGurung · December 16, 2022, 9:39am

Indeed, I was closely following the development and was looking forward to it landing upstream. But alas, it was not to be. This would have greatly simplified the OCaml artifact names, I think. Additionally, I think it would have opened the possibility of the ocamlc/ocamlc.opt binary being as good as dune at building OCaml artefacts.

Are there any counter proposals in public with regards to that rfc?

dbuenzli · December 16, 2022, 1:03pm

Not that I’m aware of; the status quo is likely to be political as well. There is the namespace proposal, but when we were discussing the RFC the library linking proposal was rather seen as a first step towards it. Other than that, this RFC brings a few clarifications on the underspecified compilation and dependency model, but it doesn’t solve any of the problems of the current conceptual mess.

Leonidas · December 16, 2022, 1:29pm

It’s even better, it has deriving built-in (something I’ve been wanting in OCaml since… years), thus allowing the compiler to just derive the Debug trait, which includes printers (while not doing anything in release builds I assume, so without sacrificing the generated code size). So while the initial experience of not being able to print a random value with {} is reminiscent of OCaml, using {:?} and deriving Debug was pretty straightforward. Contrast this with %a and either having to depend on ppx_deriving_show or composing a debug printer using Fmt.Dump.*: in practice this is much more ergonomic on the Rust side because it just works (also due to the compiler telling you to derive Debug in the error message).

I’m not likely to switch to Rust soon, but revisiting Rust 1.65 after my last foray into it at version 0.9, I found it full of genuinely nice ideas that made onboarding onto a rather complex language quite pleasant.

jbeckford · December 16, 2022, 2:38pm

I’m teaching OCaml to a few high school students who only have a senior (“AP”) Java background. I’m not re-inventing the wheel; much of the teaching material comes from the Cornell CS3110 course. And a couple students are interning with me, so I’m highly motivated to track down the newcomer problem spots. So far the problems have been:

1. Explaining the need for eval $(opam env) or the equivalent on Windows.
2. How to fix Unbound module XXX. Which is related to …
3. What (libraries xxx) corresponds to what Opam package.
4. Almost every time they wrote a dune file something went wrong. This has been “fixed” so far by writing a tiny, task-oriented intro to Dune. Which is related to …
5. Almost all of the documentation for OCaml libraries is API-level documentation. For example, the regular expression re library has very good API-level documentation, and it has some examples. But the quantity and comprehensiveness of the examples aren’t sufficient to help a newcomer. So I’ll be teaching a tribal-knowledge trick: looking at a library’s unit tests in its source code. If only test code was automatically included in the documentation!

I’m sure at some point we’ll run into some Windows OCaml problems as well (every high school student in my Seattle metropolitan area uses Windows for their home PC).

In contrast to all the above environmental problems, the OCaml language itself has been relatively easy to understand (so far!). I think that is because we can write substantial programs in OCaml without having to know much advanced OCaml (honestly, how often do we need first-class modules and GADTs?). That is in contrast to other languages like Rust where advanced understanding is needed for even trivial programs.

Ulugbek · December 16, 2022, 3:24pm

Absolutely agree on all five points, looking back at when I was starting out with ocaml. I still can’t solve point 3 without opening the lib’s repo and looking into dune files.

Update: someone on Discord suggested that running dune installed-libraries (or ocamlfind list) and grepping the output can help with that.

gasche · December 16, 2022, 4:56pm

Replying to the original list.

1. Error messages: my guess would be that this depends a lot on the background of the student. My experience with students without any programming background is that they don’t read error messages at all – in any language – they are just interested in the source location. For people that already have programming experience, it probably depends on what their previous experience was. @mimoo is familiar with Rust, which has excellent errors, so he is easily disappointed, but until fairly recently C or Java compilers also had bad error messages (Clang was a big help in moving the status quo).
   The OCaml compiler errors have also improved a lot. Syntax error messages are still disappointing. For typing errors, my feeling is that the bad errors come with the most complex language features, and that some projects have gone overboard with these features in a way that hurts usability, but for mundane programming I find the errors in fact reasonable.
   Summary: I certainly agree that error messages are important and I’ve worked on improving them, but I would not list them as a “main reason” to give up on the language.
2. Number of resources. Maybe, and this is an issue that is hard to solve by itself. I wish the OCaml community had stronger communication habits, people writing blog posts with cute examples of using their favorite library and what not. I’m not sure how to improve this – besides generally making the language better so that more people join naturally.
3. Tooling.
   - Personally I don’t have a problem with opam files, which I find relatively straightforward, or opam usage that is relatively well-documented in my experience. On the other hand, I have come to dislike the global-switch workflow: I think we should have one local switch for each development project, with caching to make this pleasant. (Yes, just like esy.) @dra27’s work on relocatability is a major step in doing this “the right way”, and I wish other people also considered it a high-priority project.
   - I agree that the Dune documentation is disappointing (it’s just hard to find information there), and I think that the tool would need more usability work. On the other hand, I would rather expect beginners to start with relatively simple Dune files, and those are okay. Onboarding OCaml newcomers directly with a large, complex project is probably more difficult, but I never had this experience so I cannot tell.

Among the other things mentioned in the thread, I agree that deriving would be nice, or… maybe it’s possible to add enough runtime type information in values that we can do a decent job of printing them for debugging. (I’m told that @let-def worked on something similar at some point.) The problem with deriving is that it’s fairly difficult to specify in a satisfying way. (It either feels like a simplistic ad-hoc solution or a complexity monster.) Modular implicits would help, but right now no one is actively working on them so we just have to be patient and look for volunteers I guess.

jhw · December 16, 2022, 5:05pm

I’m already on record in support of the observations @dbuenzli makes about the conceptual mess entailed by having both opam list and ocamlfind list produce mostly related (except where they are not) lists of conceptually different things called “packages”…

But @jbeckford points at another thing that I feel makes life difficult for OCaml users, the available unit test frameworks. The best one is probably Alcotest, but it’s not great, and I haven’t adopted it myself yet. Still using OUnit, which I also do not love very much.

At my day job, we work mainly in C++ and the available unit test frameworks there are, well, better. My favorite is Catch2, but even the more popular one from G* has substantially better ergonomics than all of the available frameworks for OCaml.

I’m not sure what it is about OCaml that makes it hard to design a good unit test framework. Lack of good hygienic macros in the core compiler toolchain and standard library? No language support for polymorphic overloading? Combination of both? Other factors? No idea.

But if we’re collecting reasons for people to grind their teeth on encountering OCaml, I’d have to put the unit test frameworks near the top of the list.

Jon_Harrop · December 16, 2022, 5:55pm

Multiple personalities. Bytecode or Native code? Batteries or Base? Lwt or Async?
Dead packages. Many (most?) Opam packages appear to be dead but there’s no way to tell which are thriving and which are dead. For example, lablgtk3’s own entry-level sample dies with a seg fault on Mac OS, lablgtk2 installs via Opam but isn’t recognised as a package by Dune, labltk’s canvas demo produces a blank window instead of the shapes it is supposed to.
Syntax. All of this weird [@@deriving show] stuff is fine for experts but not ideal for newcomers trying to print some value of some type for debugging purposes.
Installation is a hurdle for newcomers. Why have errors like Error: Library "cohttp-lwt-unix" not found when you can just install the library for the user? Took me an hour to get a barebones setup working with VSCode this week. If a website offered an OCaml IDE with persistent store you could just log on and start working. Better yet, integrate HTML, graphing and charting and we can have a nice technical computing environment.
Lack of core functionality either entirely (like generic printing, generic sets and maps) or out-of-the-box (read the lines in a file). Stdlib improvements would help a bit but polymorphic equality, comparison and hashing are fundamentally broken and functors are a heavy solution.
Opam and Dune could be easier to use. Why must I opam init by hand? Why must I repeatedly update my terminal’s opam environment with eval $(opam env) by hand? Why does Dune create a behemoth multidirectory bin+lib scaffold by default when 99% of the time I just want ‘Hello world’?
Gotchas. I just tried extending the String module in a vanilla dune project and got circular module references that I couldn’t figure out how to fix so I gave up. I used to do that all the time.
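On the last gotcha, the post does not show the code, so what follows is only a guess at one common cause: in a dune project, a file named string.ml defines a module that is itself called String, so a reference to String inside that file may no longer reach the standard library. Qualifying the standard module as Stdlib.String sidesteps the ambiguity. The sketch uses a nested module so that it runs as a single file; `is_blank` is a hypothetical helper added for illustration.

```ocaml
(* Extend the standard String module with one extra function.
   Assumption: the same body could live in a file string.ml; the
   explicit Stdlib. prefix is what keeps it from referring to itself. *)
module String = struct
  include Stdlib.String

  (* Hypothetical helper: true when the string is empty or only whitespace. *)
  let is_blank s = trim s = ""
end

let () =
  (* Both the original functions and the new one are available. *)
  Printf.printf "%d %b\n" (String.length "abc") (String.is_blank "  \t")
```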

As an exercise, consider a pseudocode program like:

data = [[1,2,3],[4,5,6],[7,8,9]]
print data

How is this written in OCaml? First you have to install a package to get a second-rate print but there’s no GUI so you use the CLI:

opam install ppx_deriving

Then you have to figure out the necessary incantation in another syntax that requires a specific structure for which there is no editor tooling or GUI:

(executable
 (public_name foo)
 (name foo)
 (preprocess (pps ppx_deriving.show ppx_deriving.ord))
 (libraries ))

Now we’re ready to start coding but we must remember OCaml uses ; separators instead of , like everyone else and we can use in or ;; for entirely unclear reasons:

let data = [[1;2;3];[4;5;6];[7;8;9]] in

Now we find the function to print is in another module, we must add \n ourselves, don’t forget to flush with %! and we’ll need to Google the insane inline PPX syntax required to print something:

Printf.printf "%s\n%!" ([%derive.show: int list list] data)

I don’t want to create a big temporary string though so I do:

let () = Format.printf "%a\n%!" [%derive.pp: int list list] data

But that doesn’t work and I cannot figure out how to fix it.
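For comparison, here is a sketch of the same exercise using only the standard library's Format module, with no ppx and no dune configuration. It does not repair the inline ppx syntax above; it swaps in hand-composed printers instead. The names `pp_list` and `pp_data` are made up for this example.

```ocaml
let data = [ [ 1; 2; 3 ]; [ 4; 5; 6 ]; [ 7; 8; 9 ] ]

(* Print a list as [a; b; c], given a printer for its elements. *)
let pp_list pp_elt ppf xs =
  let pp_sep ppf () = Format.fprintf ppf "; " in
  Format.fprintf ppf "[%a]" (Format.pp_print_list ~pp_sep pp_elt) xs

(* Printers compose: a list of lists of ints. *)
let pp_data = pp_list (pp_list Format.pp_print_int)

(* %a takes a printer and a value, so no intermediate string is built.
   @. ends the line and flushes. *)
let () = Format.printf "%a@." pp_data data
```

This is more typing than a generic print, which is the complaint, but it needs nothing beyond the compiler.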

alan · December 16, 2022, 6:16pm

The OCaml toplevel will print the results of expressions, so the printing functionality already seems to be possible. I wonder what’s preventing a polymorphic “show” function that converts a value to how it would appear as a toplevel result?

octachron · December 16, 2022, 6:33pm

Types don’t exist at runtime in OCaml. A function that prints a value according to its type will never be possible in OCaml by design. The REPL is a different matter since the REPL can inspect the environment of the code it is executing and use the type information it has available at hand to implement some basic printing.

alan · December 16, 2022, 6:36pm

Ah, silly me. I forgot that types are erased, so there’s no way of distinguishing between [], 0, and None, or { x = 0 } and Some 0…
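The ambiguity can be observed directly with the Obj module. This is a sketch for illustration only: Obj is an unsafe, internal-facing module that ordinary code should not rely on, and the `point` type and `describe` helper are made up here.

```ocaml
(* Hypothetical single-field record, standing in for { x = 0 }. *)
type point = { x : int }

(* Describe what the runtime actually sees: either an immediate integer
   or a heap block with a tag and a size. No type information is involved. *)
let describe (v : Obj.t) =
  if Obj.is_int v then Printf.sprintf "immediate %d" (Obj.obj v : int)
  else Printf.sprintf "block tag=%d size=%d" (Obj.tag v) (Obj.size v)

let () =
  List.iter
    (fun (label, v) -> Printf.printf "%-10s %s\n" label (describe v))
    [ "[]", Obj.repr [];
      "0", Obj.repr 0;
      "None", Obj.repr None;
      "{ x = 0 }", Obj.repr { x = 0 };
      "Some 0", Obj.repr (Some 0) ]
```

The first three values come out identical, as do the last two, which is why a runtime function cannot choose the right notation on its own.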
