What are the biggest reasons newcomers give up on OCaml?

Chet_Murthy · December 21, 2022, 6:46pm

Oooh, shiny! Thank you, Jon! I knew of Brzozowski derivatives (thanks to Jean-Christophe Filliatre), but these are even shinier!

jhw · December 21, 2022, 6:48pm

On the topic of modular implicits, I’d like to pass along that, the more I read and study how Scala 3 refactored its entire approach to contextual abstraction, the more impressed I am with the elegance of what they have done there.

Now that the multicore runtime is mostly checked off the list of things about OCaml that deter potential adopters, I sense that similar efforts to improve the story around polymorphic overloading are likely to gain more traction. My hope is that whoever is leading that effort on the OCaml side will take a long hard look at what Scala 3 did there, because, short shameful confession time: I’m seriously envious. It’s nice.

rdavison · December 21, 2022, 8:42pm

What (libraries xxx) corresponds to what Opam package.

If you use the esy package manager, there are some subcommands that print this out very nicely. For instance:

$ esy ls-libs
info ls-libs 0.6.12 (using package.json)
my_silly_package@0.0.1 [build pending]
├── @opam/async@opam:v0.15.0 [built]
│   ├── async.unpack_sequence
│   ├── async.persistent_connection
│   ├── async.log_extended
│   ├── async.lock_file_async
│   ├── async.async_rpc
│   ├── async.async_quickcheck
│   ├── async.async_command
│   └── async
├── @opam/async_js@opam:v0.15.1 [built]
│   └── async_js
├── @opam/async_unix@opam:v0.15.0 [built]
│   ├── async_unix.thread_safe_pipe
│   ├── async_unix.thread_safe_ivar
│   ├── async_unix.thread_pool
│   └── async_unix
├── @opam/core@opam:v0.15.0 [built]
│   ├── core.validate
│   ├── core.top
│   ├── core.base_for_tests
│   └── core
...

In fact, you can even get a list of the modules in each lib.

$ esy ls-modules
info ls-modules 0.6.12 (using package.json)
my_silly_package@0.0.1 [build pending]
...
├── @opam/notty@opam:0.2.3 [built]
│   ├── notty.unix
│   │   └── Notty_unix
│   ├── notty.top
│   │   └── Notty_top
│   ├── notty.lwt
│   │   └── Notty_lwt
│   └── notty
│       ├── Notty
│       ├── Notty_grapheme_cluster
│       ├── Notty_uucp
│       └── Notty_uucp_data
├── @opam/notty_async@opam:v0.15.0 [built]
│   └── notty_async
│       └── Notty_async
...

shonfeder · December 21, 2022, 9:30pm

I don’t think this is true at all. I’m also not even sure it makes sense to rank communities: is the Windows community (is it even a community?) better than the Mac community? Is the Twitter community (is it even one?) better than the community on this forum? If a nation is a community, is it obvious that the biggest nation is the best nation, or the nation with the best community? Is the JavaScript community better than the Rust community by virtue of its size?

Even if you substitute “software user base” for “community” I don’t think this claim is accurate. Most of the software (and hardware!) I opt for (Arch, Firefox, OCaml, i3, Emacs, etc.) is decidedly not the software in its class with the largest user base. But I use these tools because I find them to be the best for the purposes I have in mind, relative to my needs, values, and priorities.

I’m not anti-growth, but definitely in the camp that thinks growth should be sustainable. Afaict, “bigger is better” is just generally not true in any domain, and more often than not it is because the presumed criterion for comparing “better” and “worse” is so vague as to be “not even wrong”.

edit: I’d also like to note that I say this not as someone who wants to gatekeep or exclude people from OCaml or any other technology. On the contrary, most of my very modest contributions to the OCaml ecosystem have been in the form of documentation or small features to try making stuff I use more accessible, esp. for newcomers to the language. I’ve also tried to help with some mentorship. I want to use and develop extremely accessible and very good technology that helps empower people as much as possible.

shonfeder · December 21, 2022, 9:34pm

This will be a dream come true

UnixJunkie · December 22, 2022, 12:22am

Lack of taste.

  taste
     1. (primarily MIT) The quality of a program that tends to be
     inversely proportional to the number of features, hacks, and
     {kluges} it contains.  Taste refers to sound judgment on the
     part of the creator.  See also {elegant}, {flavour}.

mimoo · December 22, 2022, 1:28am

I would try Discord; I always get good answers there, and the community is really nice : o

alan · December 22, 2022, 1:30am

The ecosystem is an obvious issue, and yes, it’s a chicken-and-egg problem.

For a project for a recent internship, I was given the freedom of using any language I wanted. I wanted to use OCaml, but due to missing functionality in one of its libraries, I ended up using Rust, which had a library that did what I needed. Then, for another project for the same job, I wanted to use Rust again, but in this case, Rust had a bunch of immature libraries and I wasn’t sure if they would be maintained for a long time. (One library I needed already seemed to be unmaintained.) Finally, I tried Python and found one big library that did everything I needed!

Given a choice between OCaml, an extremely elegant language with a smaller ecosystem, and Python, a more kitchen-sink language with a bigger ecosystem, if I need to accomplish a task for my boss so my company can meet business requirements, I would choose Python. This is a lesson that I learned in my internship.

The irony is that for a long time, Python was not popular, but then it shot up in popularity. With its new popularity came new libraries, a self-reinforcing pattern. See this SE Stack Exchange question for explanations of Python’s popularity. I think that OCaml has the potential to repeat the success of Python. To quote the bullet points of the answer by nikcub on that page:

FCGI happen, and then WSGI. Prior to that you had to run Python scripts as ordinary CGI, which was not fast enough. mod_python was nowhere near as good as mod_php, the .NET CLR or the Java platform JIT VM.

Prominent Universities began teaching algorithm and other classes using Python, and book like ‘Learn to think like a Computer Scientist’ were published and became popular.

It became a top-tier implementation language at Google earlier in the decade, and this had an impact in how seriously it was taken.

Visible developers and standards developers, such as Joe Gregorio and Mark Pilgrim were both using Python to implement the prototypes of the Atom protocol. Pilgrim then wrote DiveIntoPython which helped a lot of people learn and pick up the language.

The 2.x branch became stable and implemented features such as Unicode support, good XML parsing, a new Garbage Collector, generators and functional methods, etc.

The biggest tipping point was Django - which became very famous along with RubyOnRails around 2005. The Django philosophy differed to that of Rails, and a lot of developers found it more suitable for projects.

OCaml is taught at prestigious universities like Cornell.
OCaml is used by prestigious companies like Jane Street.
OCaml is used for the WebAssembly reference interpreter.
OCaml 5 is now out and has multicore and effects!

As companies such as Jane Street and projects such as Mirage use OCaml, they naturally add new useful libraries to OCaml’s ecosystem. (However, I try to avoid Jane Street libraries because of their dependency on Core or Base - I find Jane Street to be “contagious” and prefer to write libraries that aren’t as opinionated in preferring a specific section of the ecosystem.) However, what amazes me is that Rust, a younger language, already seems to have a bigger ecosystem than OCaml does, perhaps because of the many Rust enthusiasts who go out of their way to write useful libraries. Why don’t we compile a list of libraries that we wish existed in OCaml?

However, I would add the caveat that many Rust libraries I found seemed incomplete, immature, or unmaintained. I hypothesize that this is a problem if the library is written by a hobbyist, who may move on from the project, while if the library is necessary for business needs, it will have pressure to be complete, correct, and maintained.

mimoo · December 22, 2022, 1:35am

btw Cargo has this cool thing where it’ll first look for a build.rs and run that (potentially generating files) before anything else. This is used quite a lot in all sorts of ways, and would be a nice way to avoid writing rules in dune.

In general, my main gripe with dune is that it’s like writing Makefiles: it uses very few conventions, and doesn’t make many assumptions about the directory structure or the names of the files. It’s too flexible, so everybody uses it differently and every OCaml project is structured differently. Ideally, the build system should be as invisible as possible and just work™.

jaxon · December 22, 2022, 2:34am

And that is plain awful and will change the environment for OTHER programs running. This is the MS-DOS way, where you only run one program at a time, never two, and God forbid more than one user on the machine.

It is like having global variables in a program.

timmy_jose · December 22, 2022, 5:25am

Fully agreed. The less the friction, the better it is, especially for beginners.

bluddy · December 22, 2022, 6:59am

This is exactly right, and it’s why I say bigger is better (in language ecosystems). Bigger ecosystems mean more users, with a higher chance of some of them contributing critical libraries and apps that attract more users. With the added attention comes more money, and more money means more core developers, which then leads to more advanced language features. It’s all about feedback loops. Smaller language ecosystems, on the other hand, are always in danger of shrinking and disappearing.

Sorry for not being clear. I was specifically talking about programming language communities. And all other things being equal, yes, JavaScript’s is a stronger community than Rust’s, but they’re also not direct competitors (at this point). What’s more interesting is the C++ community vs the Rust community. Rust’s mission is to convince low-level programmers to switch to it (abandoning C++), and C++'s is to advance with new features and defend its turf. Currently Rust’s strength is rising due to an excellent feature set and good community management, and we’ll see where it ends up.

Graduating from a niche community into a large, well-known language community is a huge achievement, and Rust has accomplished that in a very short period of time. This all results in the kinds of feedback loops that will make Rust more and more dominant in its space (assuming the competition can’t fight back sufficiently well). We should all be hoping OCaml does the same, and in fact, we were on that path IMO while the ReasonML community was flourishing.

If nothing else, growing to a large community means OCaml programmers can actually go out there and find jobs, just as Haskell, Scala, C# and Rust programmers can. That’s a huge deal. Bigger is better in language communities - look how many Javascript jobs are available in industry and how few jobs are left for other programmers nowadays. Not better in some kind of value judgement way – just better in the sense that it’s what we need to strive for to thrive as a language.

This is precisely the value of being opinionated. dune had to convince legacy projects to switch to it, and as a result, it had to add features to support different variations of existing projects. If you were following OCaml before the dune era, you’d know build systems were a giant mess, and it seemed like there was no way out. Hopefully we can move in a more opinionated direction now that dune is standard.

BTW It’s not enough to be opinionated – it’s also important to have conventions that match user expectations, and those are often set by other languages. ocamlbuild, one of the tools that preceded dune, was opinionated, but also had unintuitive conventions given what people were used to. The same can be said of dune’s sexp format – it’s opinionated in the wrong way, since now people are required to learn an unfamiliar format to use the build system.

dra27 · December 22, 2022, 7:39am

What other programs? It’s changing the environment for the console window you’re in, just as eval $(opam env) does?

dbuenzli · December 22, 2022, 8:10am

Since this thread is becoming boring (as all threads of this kind eventually do) and since this sub-discussion is coming up again :–) I’d just like to mention to @Jon_Harrop, who raised it initially, that apparently nowadays, depending on the shell you are using, you don’t need to eval $(opam env).

I suddenly realised I hadn’t issued that command for a while and it seems to be due to my OS changing the default shell to zsh and some magic you can read about in the various env_hook.* scripts here. Unless you are using a shell that doesn’t have magic, maybe try an opam init --reinit to install the proper hook.

grayswandyr · December 22, 2022, 12:27pm

I have been teaching so-called advanced functional programming in OCaml for several years to “Bac+4” students (= first year of MSc) in a French grande école. Apart from the deep issue of being exposed to another way of thinking (w.r.t. programming languages featuring things such as pointers, imperative programming, OO design, untyped or explicitly-typed…), which is good, the more mundane but important issues for students I can think of are:

a syntax that is not uniform for the most fundamental operation of the language: application. Function application is generally curried, data constructor looks like application (and semantically is kind of) but it looks uncurried, functor application uses parentheses, polymorphic-type application is postfix…
when teaching GADTs, you can’t define a function as usual: introducing the type a b . ... annotation forces you to declare the function type, then put the parameters on the RHS of =, under the fun keyword (this is unsettling for students)
no good story for namespaces: open is too coarse, and so are let open Long_module_name in and Long_module_name.( ... ); something like open Long_module_name.( f1, f2 ...) would be nice
dune requires 3 files just to compile a basic file and launch utop
dune utop is in fact quite slow to launch utop (several seconds on my not-so-old laptop for a single ml file, without libraries apart from ppx_inline_test)
apart from the fact that ocamlformat is less and less customizable and a bit too opinionated in my mind, auto-formatting (which is quite good to show students how to format code) is not run if you haven’t a fourth file (.ocamlformat, even empty)
by default, dune explores all the hierarchy upwards when you invoke it until finding a dune-workspace file. Needless to say, students often have buggy files everywhere in their hierarchy and then, forgetting to create a dune-workspace file, see dozens of errors that seem incomprehensible to them. (For people who discovered computers through the terminal, like me, notice that a lot of fresh students do not even know how to explore a file hierarchy graphically, let alone in the terminal! Blame smartphones and their search tools that incite students into seeing the file system as a blob rather than a tree, although CS students who lack curiosity w.r.t computers and OSes can certainly be blamed too.)
the default “dev” profile of dune treats warnings as errors, which is too strict for students (e.g. an unused variable, so common in student programs, yields a warning). But we can’t use the “release” profile because it switches testing off, and we encourage our students to test their code. So we must teach our students to update the dune-workspace file to add a specific clause (env (dev (flags (:standard -warn-error -A)))), whose syntax no one can remember.
dune runtest doesn’t show anything if things go well, not even something like “all tests ok”, so students are a bit lost
VS Code, by far the most used tool nowadays, has a good OCaml mode, but it is still lacking: some code is often shown as badly typed while it’s correct (as can be checked by running dune build on the command-line); types shown by the plugin sometimes differ from those shown by ocamlc; the most important refactoring operation, namely renaming an identifier in context (= taking all sorts of binders into account) is AFAIK impossible; code formatting doesn’t seem to agree with dune build -w @fmt --auto-promote
eval $(opam env) is incomprehensible for students (and actually often needed)
the fact that there are several programs to get you running (mainly opam and dune), with different command-line syntaxes, although comprehensible, adds some unwanted complexity in the eye of newcomers
finally, the elephant in the room: error messages are far better than they once were. But there are still some issues. The most frequent errors, apart from syntax errors (the location of which is always difficult), are type errors. I think terser, more emphasized (using colors and/or font weight), more to the point, messages could entice students to go the extra mile to try to understand their errors (instead of calling the teacher immediately). Also, I seem to recall that type errors in the context of modules and functors can be very long while in fact they resolve to a small mistake concerning only one function in a large module body. Finally, once again, the VS Code plugin typer does not always agree with the compiler typer…
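
To make the GADT item in the list above concrete, here is a minimal sketch. The expr type and the eval function are hypothetical names invented for illustration, not taken from any course material. It shows the shape students have to learn: the type goes in an annotation introduced by type a., and the parameter moves to the right of = under fun.

```ocaml
(* A tiny typed expression language, written as a GADT.
   All names here are illustrative assumptions. *)
type _ expr =
  | Int : int -> int expr
  | Bool : bool -> bool expr
  | Add : int expr * int expr -> int expr
  | If : bool expr * 'a expr * 'a expr -> 'a expr

(* Writing `let rec eval e = match e with ...` with no annotation does not
   type-check here, because the branches return different types. The
   locally abstract type `a` is declared in the annotation, and the
   parameter `e` is bound on the right-hand side with `fun`. *)
let rec eval : type a. a expr -> a = fun e ->
  match e with
  | Int n -> n
  | Bool b -> b
  | Add (x, y) -> eval x + eval y
  | If (c, t, f) -> if eval c then eval t else eval f

let () =
  (* prints 3 *)
  Printf.printf "%d\n" (eval (If (Bool true, Add (Int 1, Int 2), Int 0)))
```

The same annotation can also be followed by function instead of fun e -> match e with, which some students may find slightly less jarring.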

It looks like most of these issues are actionable and could be addressed reasonably easily by tooling developers in the know.
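
On the namespace item, the options that exist today can be sketched as follows. The function names are hypothetical and the standard List module stands in for Long_module_name; the selective open Long_module_name.( f1, f2 ...) form wished for above is not valid OCaml, so the third variant only approximates it by rebinding the names one wants.

```ocaml
(* Option 1: `let open M in`, which brings every name of the module into
   scope for the rest of the expression. *)
let total_length (words : string list) : int =
  let open List in
  fold_left (fun acc w -> acc + String.length w) 0 words

(* Option 2: `M.( ... )`, a local open limited to one parenthesised
   expression. Still all-or-nothing. *)
let squares_sum (xs : int list) : int =
  List.(fold_left ( + ) 0 (map (fun x -> x * x) xs))

(* Option 3 (a workaround, not a language feature): rebind only the
   names that are needed. *)
let evens (xs : int list) : int list =
  let filter = List.filter in
  filter (fun x -> x mod 2 = 0) xs

let () =
  Printf.printf "%d %d %d\n"
    (total_length [ "ab"; "cde" ])
    (squares_sum [ 1; 2; 3 ])
    (List.length (evens [ 1; 2; 3; 4 ]))
```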

(I’m not expecting anyone to implement a new, uniform syntax (item 1) but I would personally welcome this breaking change, together with a tool to convert the “old” syntax to the new one (I’m old enough to recall it already happened, although for smaller-scale changes).)
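
For readers who have not hit item 1 yet, the four kinds of application look like this side by side. This is a small illustrative sketch; add, shape, area, IntSet and table are made-up names.

```ocaml
(* 1. Function application: curried, written by juxtaposition. *)
let add x y = x + y
let three = add 1 2

(* 2. Data constructor application: looks like a call, but the arguments
   come as one parenthesised, comma-separated group and the constructor
   cannot be partially applied. *)
type shape = Rect of int * int
let r = Rect (2, 3)
let area (Rect (w, h)) = w * h

(* 3. Functor application: parentheses are mandatory. *)
module IntSet = Set.Make (Int)

(* 4. Type application: postfix, with a tuple-like syntax when there are
   several parameters. *)
let xs : int list = [ 1; 2; 3 ]
let table : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.add table "three" three;
  Printf.printf "%d %d %d\n"
    (area r)
    (IntSet.cardinal (IntSet.of_list xs))
    (Hashtbl.find table "three")
```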

Jon_Harrop · December 22, 2022, 2:18pm

FWIW: 0.3s on my M1 Macbook Air.

grayswandyr:

dune requires 3 files just to compile a basic file and launch utop

…ocamlformat…is not run if you haven’t a fourth file (.ocamlformat, even empty)

by default, dune explores all the hierarchy upwards when you invoke it until finding a dune-workspace file. Needless to say, students often have buggy files everywhere in their hierarchy and then, forgetting to create a dune-workspace file, see dozens of errors that seem incomprehensible to them. (For people who discovered computers through the terminal, like me, notice that a lot of fresh students do not even know how to explore a file hierarchy graphically, let alone in the terminal! Blame smartphones and their search tools that incite students into seeing the file system as a blob rather than a tree, although CS students who lack curiosity w.r.t computers and OSes can certainly be blamed too.)

…update the dune-workspace file to add a specific clause (env (dev (flags (:standard -warn-error -A)))), whose syntax no one can remember.

dune runtest doesn’t show anything if things go well, not even something like “all tests ok”, so students are a bit lost

…code formatting doesn’t seem to agree with dune build -w @fmt --auto-promote

eval $(opam env) is incomprehensible for students (and actually often needed)

the fact that there are several programs to get you running (mainly opam and dune), with different command-line syntaxes, although comprehensible, adds some unwanted complexity in the eye of newcomers

You mention files a lot. Here is a wild idea that risks offending the elderly dendrophiles but what if there was an OCaml programming system that wasn’t file system based and wasn’t dependent upon CLI tools and all the configuration files were replaced with a GUI? Perhaps web based using Monaco (the editor component from VSCode). All of your code is always available and searchable.

lukstafi · December 22, 2022, 2:22pm

There used to be a project G’Caml extending OCaml with ad-hoc polymorphism. I believe such a language would be more newcomer-friendly, but it’s not the OCaml way.

grayswandyr · December 22, 2022, 2:48pm

Fresh OCaml 5 install (it was the same with 4.14): Ubuntu 22.04; one ML file of 200 lines, no lib except ppx_inline_test: 7s for the first run of dune utop (most of the time is spent in the build of utop.exe), then 0.3s for subsequent calls (without rebuild).

I tried to explain what the current user-facing issues are and to point out actionable low-hanging fruit to improve the current state.

Your idea is a whole other issue. In any case, I would still expect CS&SE-major MSc students of a highly-selective college to fluently use basic command-line shell tools, think of their filesystem as a tree and realize how a program is built (compiled and linked). BTW this is a reason why I don’t want to use TryOCaml in my teaching, even though it’s very nice (I also fear students would feel that OCaml is an unrealistic toy, which they already think a little as it’s not mainstream).

bluddy · December 22, 2022, 3:29pm

I agree that this is an example of a terrible opinionated choice by dune devs. I add this line immediately to a new project, and of course have to look it up every time.
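
For anyone else who has to look it up: a sketch of the clause quoted earlier in the thread, laid out as it might appear in a dune-workspace file. The (lang dune 3.0) line is an assumption; use whatever version the project actually targets.

```
(lang dune 3.0)

; Keep warnings visible in the default "dev" profile, but stop
; treating them as errors.
(env
 (dev
  (flags (:standard -warn-error -A))))
```

The same env stanza can also live in a dune file at the root of the project, which may be easier for students who already have one there.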

jbeckford · December 22, 2022, 4:18pm

Can you (@bluddy) or @grayswandyr file an issue in the ocaml/dune GitHub repository asking to change the default in a new version of Dune?
I’ve opened issue #5401 in ocaml/opam, “Change default to auto-install hooks on *nix”, to change the opam init default so that eval $(opam env) is not needed by default.
