What are the biggest reasons newcomers give up on OCaml?

FWIW, from my perspective OCaml has really made strides in usability from where things stood, say, 15 years ago. Camlp4 was a pain for me to get into, and the know-how wouldn’t stick (it was still painful to use on a second project). PPX is such a breeze in comparison.

But in the meantime I got spoiled by interactive debuggers for Python and TypeScript integrated into VSCode: the manner of debugging where you can watch human-readable representations of the values of identifiers in scope as you step through the code line by line in the editor.

6 Likes

My own experience with Monads has been:

  1. they’re usually poorly defined because the OCaml philosophy seems to be that reading types is enough to understand how a monad is to be used.
  2. you can’t use them properly (via the let+ let* or let ppx) without understanding exactly what the code gets translated to (because otherwise you might introduce bugs, for example when mixing monadic code with imperative code and the monad mutates some global state). And it’s always a nightmare to remember exactly how all of these let%...in get composed into a succession of let _ in (fun x -> _).
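To make the translation in point 2 concrete, here is a minimal sketch using the option monad. The names `parse_int`, `add_sugared`, `add_desugared`, `with_effect` and `calls` are made up for illustration, and the operators are defined locally rather than taken from any particular library. It shows the sugared form next to roughly what it expands to, and how imperative code placed after a `let*` silently does not run when the bind short-circuits.

```ocaml
(* Binding operators for the option monad, defined locally
   (binding operators need OCaml 4.08 or later). *)
let ( let* ) opt f = Option.bind opt f
let ( let+ ) opt f = Option.map f opt

(* Hypothetical helper: parse a string as an int, None on failure. *)
let parse_int s = int_of_string_opt s

(* Sugared version. *)
let add_sugared a b =
  let* x = parse_int a in
  let+ y = parse_int b in
  x + y

(* Roughly what the sugared version expands to: each "let op ... in"
   becomes a call to the operator with the rest of the code wrapped
   in a function. *)
let add_desugared a b =
  ( let* ) (parse_int a) (fun x ->
      ( let+ ) (parse_int b) (fun y -> x + y))

(* Mixing in imperative code: the increment lives inside the
   continuation, so it only runs if the bind decides to call it. *)
let calls = ref 0

let with_effect s =
  let* x = parse_int s in
  incr calls;
  Some (x * 2)

let () =
  assert (add_sugared "1" "2" = add_desugared "1" "2");
  ignore (with_effect "oops");
  assert (!calls = 0);
  ignore (with_effect "7");
  assert (!calls = 1)
```

With the option monad the skipped side effect is easy to spot; with a monad that schedules or retries its continuation, the same reasoning is needed but the behaviour is less obvious.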

So to me, as a non-theorist and mostly applied guy, it’s often used with this let syntax sugar that’s a very leaky abstraction that’s not pleasant to use (even if it seems to make code more readable).

4 Likes

That’s interesting, it seems that in the JavaScript world most devs have adapted to their equivalent async/await syntax? I won’t claim that they adapted perfectly without any bumps, but it seems to be the mainstream async programming style now. I’d argue that OCaml’s basic let-operators (let* and let+) are only slightly more complex than async/await and conceptually the same.

I don’t disagree at all with let* and let+ being only slightly more complex than async/await (and have been able to successfully use them as a newcomer with some Rust/Haskell/JavaScript/Python background). I think that, by virtue of their naming, however, they are less “discoverable” in the sense that transfer of intuition from other languages does not happen on its own (unless you have successfully used Haskell before, or have an intuition around using Monads from other sources). Monads are a more abstract concept than async/await and different people have different capabilities and experience regarding abstract reasoning.

Async and await are terms used throughout various programming languages and they refer to a specific kind of “non-blocking chainable computation”.

So when you have this basic understanding of what async/await means and how to use it in JavaScript, you have the ability to quickly figure out how to write code that uses async/await in Rust or any other language that implements the same intuition around the async/await concept. Even if there are subtleties that you may not grasp yet, you have a good chance of figuring out quickly what you need to do.

The terms “async”, “await”, “Promise” lend themselves well to internet search via search engines or Q&A sites: you get reasonable search results relating to the specific type of “chaining computation” you want to do. Anything with operators and symbols is harder to search and find, and then you end up at a Monad tutorial (which may be exciting to some percentage of users and frustrating or overwhelming to the rest).

So while there is beauty in knowing that async/await can be represented (or implemented) as a Monad, a language having a “discoverable surface area that facilitates knowledge-transfer” can make onboarding practical-minded people who just want to get some work done or build a cool thing so much easier.

8 Likes

As always, the biggest issue with let+/let* compared to something native
to the language like async/await in Rust, JS, etc. is that they mesh
poorly with existing control flow structures. Async/await typically
works well with loops, exceptions, early return, conditionals, argument position,
etc. whereas let+/let* is more rigid.
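A small sketch of the rigidity, using the result monad with a locally defined `let*` (the helper names `parse` and `sum_all` are invented for this example). A `for` loop has no way to carry the result of a bind out of its body, so "stop at the first error" usually ends up written as explicit recursion or a fold instead:

```ocaml
(* Binding operator for the result monad, defined locally. *)
let ( let* ) r f = Result.bind r f

(* Hypothetical helper: parse a string as an int or report an error. *)
let parse s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not an int: " ^ s)

(* What would be a loop with an early return in an async/await
   language becomes a recursive function: the rest of the iteration
   has to live inside the continuation of the bind. *)
let rec sum_all acc = function
  | [] -> Ok acc
  | s :: rest ->
    let* n = parse s in
    sum_all (acc + n) rest

let () =
  assert (sum_all 0 [ "1"; "2"; "3" ] = Ok 6);
  assert (sum_all 0 [ "1"; "x"; "3" ] = Error "not an int: x")
```

The recursion is not hard to write, but it is a different shape from the loop one would reach for first, which is the mismatch being described.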

6 Likes

Well they’re also more general than async. :slight_smile: Which is why I miss them in ReScript. :frowning:

1 Like

I’m currently searching for a simpler ML-like language.
OCaml is a nice language, but I’m frustrated because getting started on my simple project requires a lot of learning and fixing.
I used C in a simple project and I don’t remember a similar pain.

1 Like

How much of that frustration is due to the language, and how much is due to the tooling? More specifically, how much is due to the difficulty of understanding how OCaml builds work?

I ask because my pet theory is that much of the problem for newcomers is the sheer opacity and complexity of the toolchain. It is similar to C (headers, sources) but a lot more complex. A common response to problems is “just use Dune”, but I personally think that is an anti-answer, not far from “you don’t need to know”. Some of us at least want to know exactly how our sources get transformed into running code; to me at least that’s an essential part of understanding “OCaml” - meaning not just the language in the abstract but all the mechanisms that make it work in practice. And that is woefully underdocumented.

4 Likes

The initial build and installation of opam, ocaml, etc. has not always been pain-free for me, and when there’s a problem, it’s pretty confusing at first. It’s possible that that’s part of what @Drito has in mind.

I strongly agree with this.

I started with a TypeScript + Rust project, large portions of which I’m now converting to OCaml. However, I ran into many dune, jsoo, js/dom interaction issues (as evidenced by my question history). If any of the issues had gone unresolved, I would probably have gone with ReScript or stuck with TypeScript. It was only through the helpful responses of this forum that I got to a position where I feel productive in ocaml / dune / jsoo.

Without this forum, limited to only googling, I’d probably have quit quite early on.

In contrast, Rust/cargo & Scala/sbt were easy to get started in.

4 Likes

I think that this somewhat depends on what you mean by the “toolchain” and what you are using OCaml for. Casting my mind back, I don’t think I had any particular difficulties in building simple beginner-like projects using dune (or Makefiles with ocamlfind for that matter, save that, as I recall, how dependencies work when linking up your project was not well documented). The problem, such as I had one, was in the main with the OCaml language itself, including its syntax and its type and module systems.

I don’t use OCaml for writing for the browser, and if that is your interest then using jsoo and interacting with the DOM adds an additional layer of complexity, because the jsoo wrappers only take you so far and you end up writing code in a frankenstein-like hybrid of OCaml and JavaScript, which complicates, amongst other things, use of the LSP; but that issue seems to me to be about interacting with JavaScript as a backend rather than the OCaml build chain itself.

I have not counted them up, but I think that this is reflected in your posts. I am also of the impression that you have adapted to OCaml unusually quickly. For the general case I suspect it is the documentation of OCaml and its libraries which needs to be improved the most and this seems to have been recognized and is being acted on.

I think, but I am not sure, that you are basically saying that ocamlc/ocamlopt are “woefully underdocumented”. I find this surprising / I don’t agree, they are covered in the manual, see the chapter on ocamlc for example.

Your different interpretation might come from the fact that the tools were written with a build model in mind, back in the nineties, that is inherited from C, pretty simple, but also very different from your own expectations. In that build model, compilation units are pairs of a .ml and .mli file in the same directory, and build artifacts are produced in the same directory. Compiled artifacts for compilation units are then linked together, in command-line order, into complete binaries or library archives.

5 Likes

No, I was specifically talking about the build part. So we disagree; I do not think those sections are good documentation, esp. for newcomers. Just as an example, what is a newcomer to make of -no-alias-deps? I’m not a newcomer and I’m still not entirely confident that I know what it means. (Please do not explain it, I use it as an example of underdocumentation.) Even more obvious, look at the titles of those sections. Native-code compilation is not batched?

I think the OCaml build protocols are about as complicated as it gets. It’s a gross oversimplification to boil it down to "

Even if you know what that means (“compilation units are pairs”? wtf?) it is of no help at all when you run into problems. I think I’ll try putting my implementations and my interfaces in different directories; kaboom! It should work, so why doesn’t it? If my module A depends on module B, why isn’t b.cmi always sufficient? Why on earth do I need b.mli? Do I always need to put my b.cmx/cmo in the -I path? Well, no. It depends.

Now, I’ll admit I’m a little biased. I don’t even want to think about the amount of time I spent figuring this stuff out in order to write Bazel rules for OCaml. I can’t count the number of times I thought I had it figured out only to find another use case that blew everything to smithereens. All of which could have been avoided if the documentation did not suck. Or maybe if I were a little less dim, but let’s not go there.

In any case I asked my original question because I’m not sure if I’m an outlier or not. I like to know what my toolchain is doing, which is one reason I’m not very fond of Dune. Others are quite content to ignore that kind of stuff. I just would like to know if others found this kind of stuff (as opposed to “how does the language work”) a barrier to using OCaml.

2 Likes

As an aside: in many cases the only way I could figure out how the toolchain was supposed to work was by examining the log file of a Dune build. Not because I could not understand Dune but because I could not understand the OCaml build discipline.

I’d like to push back on this being a typical newcomer use case. I am happy to believe that newcomers struggle with setting up a dune project since it is the recommended build tool and its documentation has a reputation for being opaque, but less so that they are going to want to delve into the depths of the OCaml compilation model to start off with.

1 Like

I didn’t say it was. I asked a question. Obviously only a survey of newcomers could tell us if it were typical.

Agreed, but given the context of this thread, you could see how there could be a misunderstanding.

Ok, we’re at 260 messages, let’s go for 300!

Here’s another way of asking more or less the same question:

The heart of OCaml is the module system. Get rid of it and you have just another pedestrian FP language.

The module system is defined in the language. But for practical reasons programmers like to break their program code into file system units. So it is unavoidable that the language should address the mapping from language to file system. OCaml goes for the minimum (which fwiw I think is the right idea): a .ml file counts as a structure, and a .mli file counts as a signature. But no namespacing. That’s a big issue. So we get module aliases, which are complicated. What is the relationship between `module Bar = Foo__bar` and the file system? It depends. Long story short, at a very basic level (source files, meaning both interface and structure files) you have a relatively complex story to tell the newbie about this module stuff. Not even counting things like: you don’t need the compiled implementation of a dependency, you just need its compiled sig, in order to compile something that depends on it. Unless you want cross-module optimizations, that’s different. Then there are includes.
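For readers who have not seen the mapping spelled out, here is a single-file sketch of what the file layout stands for. The module names `FOO__BAR`, `Foo__bar` and `Foo` are illustrative only, and the comments about which file each piece corresponds to describe the general idea rather than the exact output of any build tool.

```ocaml
(* Stand-in for an interface file (foo__bar.mli): a signature. *)
module type FOO__BAR = sig
  val greet : string -> string
end

(* Stand-in for the implementation file (foo__bar.ml): a structure,
   sealed by the signature above. *)
module Foo__bar : FOO__BAR = struct
  let prefix = "hello, " (* hidden: not listed in the signature *)
  let greet name = prefix ^ name
end

(* Stand-in for a separate "namespace" module whose only job is to
   give the long-named unit a short name via a module alias. *)
module Foo = struct
  module Bar = Foo__bar
end

let () = assert (Foo.Bar.greet "world" = "hello, world")
```

In one file the relationships are visible at a glance. Spread across files, the same structure depends on which units exist on disk and how the compiler is invoked, which is where the "it depends" comes from.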

Personally I think this kind of stuff is core to understanding OCaml. The original question is looking for reasons newcomers give up on OCaml. I would like to know to what extent this kind of stuff contributes to surrenders, as opposed to the non-module language stuff.

I’d be curious to hear what your opinion on Rust is, or how you view its build processes if you’ve tried to use it.

From what I understand, barring the borrow checker, Rust is apparently a very beginner-friendly language — partially evidenced by the increasing number of front-end/web-devs who are now building utilities using Rust rather than nodejs.

However, as someone who’s tried to build tooling on top of Rust’s ecosystem, I’ve found the cargo build process to be as opaque as, if not more opaque than, dune might seem to a beginner. It’s very hard to work out what cargo does in terms of low-level build artefacts such as object files and libraries; I think Rust has its own intermediate library format, rlib, that is completely hidden from the user under normal processes.

Have you looked at Rust? Did you also find its build processes to be opaque? If so, then this might be a counter-example to the idea that knowing how the build process works is important for a language to be easy for beginners to pick up.

In fact I have, at least for building (but not writing actual code) and I think that’s an excellent point of comparison. I’ve seen many messages in various places on the web extolling the virtues of cargo, but I’ve never understood why. I started bazeling OCaml for the Mina protocol project which had a few rust dependencies. At the time (a few years ago) one used cargo-raze to generate bazel stuff from toml files. It was a major major PITA. I don’t remember the details, but my recollection is that toml files are (were) just kind of ridiculous. Or maybe it was crates that were preposterous, I don’t recall. What I do recall is that it was very very difficult to enable/disable features.

I wanna ask the folks who think cargo is so great: have you ever actually used the stuff in anger? Read the docs? Cause I did and I found it just as maddening and insane as your garden variety Java build system. Again, I don’t remember the details but the specification of dependencies was particularly galling, if I recall. Or maybe it was [features]. In any case it was very far from simple or clear.

I think maybe it’s a classic case of “works great until it doesn’t and then you’re in for a world of hurt”.

But to the larger question of what blocks new users: Rust has the advantage of funding, I guess. Tons of documentation - that stuff does not write itself. So even if cargo sucks as much as the next build system, well, good docs and examples can cover a lot of blemishes.

Golang is another good point of comparison. I dunno much about it, but I gather it is possible to derive an entire build metaprogram from the sources. I don’t think you can do that in Rust (not sure), but I know for a fact it is not possible even in principle with OCaml, due to the -open compiler option and module aliasing. Some kind of metadata is essential. Standardizing that kind of stuff might be a good step. For example, from source code alone you cannot infer either namespacing (dune ‘library’ components) or archiving. Should the standard toolchain standardize such metadata? I dunno.
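A rough single-file illustration of why the source alone is not enough. The modules `Foo`, `Bar`, `Without_open` and `With_open` are hypothetical, and the explicit `open Foo` below is only a stand-in for passing `-open Foo` on the compiler command line, which has a similar effect without leaving any trace in the source file.

```ocaml
(* Two different modules that could both answer to the name Bar. *)
module Foo = struct
  module Bar = struct
    let name = "Foo.Bar"
  end
end

module Bar = struct
  let name = "top-level Bar"
end

(* The same expression, Bar.name, in two compilation contexts. *)
module Without_open = struct
  let resolved = Bar.name
end

module With_open = struct
  open Foo (* stand-in for the -open Foo command-line flag *)

  let resolved = Bar.name
end

let () =
  assert (Without_open.resolved = "top-level Bar");
  assert (With_open.resolved = "Foo.Bar")
```

A tool reading only the text `Bar.name` cannot tell which unit it depends on; that information lives in the build metadata.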

1 Like