What are the biggest reasons newcomers give up on OCaml?

jhw · April 26, 2023, 3:05am

I too gave up on local opam switches a long time ago. Never bothered to figure out why my intuition about them is all wrong. (At first, I thought the problem was that I use Mercurial instead of Git, but turns out that’s only one of the problems.)

I sure hope we are not advising OCaml newbies to attempt using them because I think that would go poorly.

jake · April 26, 2023, 5:19am

Very cool! Thanks for sharing that. I’ve been thinking about writing something similar along with emacs functions to invoke it.

One thing I realized today is that because merlin works off compiled modules, most experienced ocaml devs probably keep a window open running dune build --watch most of the time they’re coding. I knew about the feature but assumed it was an enhancement. I didn’t realize that merlin only provides help based on compiled modules (which makes sense), so --watch becomes a fairly critical piece of the emacs/vi development flow.

Unbound module in general has definitely been an intimidating headwind in learning ocaml. This week I couldn’t figure out why Timere.Timedesc was unbound, until I figured out that Timedesc is its own top-level module and not a sub-module of Timere even though it’s installed as part of Timere. Timedesc is listed in Timere’s page of “Package descriptions”, which is what tipped me off that Timedesc might be a top-level package of its own even though it’s installed with the Timere library (and that “library” and “package” might be different things. Are they officially different things? This page’s title claims to distinguish the terms but doesn’t: Packages, Libraries, and Modules).

And in retrospect I feel silly for not realizing that sooner - it says “Package descriptions” right there on the page! And that’s the thing - all of these little things along the way could be correctly answered with “RTFM”, because the answers are all in the docs if you can piece it all together, but there’s enough FUD throughout the process that it’s never clear whether it’s time to ask the internet or if I’m just an idiot who can’t put two and two together (both may, of course, be true at once). Each quirk makes sense once I figure it out, but there seem to be a lot of them on the road to being able to type some code, compile, get inline help from merlin in emacs, and understand how to interpret the errors. I sheepishly admit that it once took me a week to realize I needed to restart emacs to get merlin to work rather than just calling elisp fns. I don’t often restart emacs. And I’m sure if I used VS Code instead everything would work more out of the box, but I like emacs.

One detail that makes this road more intimidating is that virtually all of the ocaml internet, including stack overflow, provides answers in terms of ocamlc, ocamlfind, and other tools I don’t directly use and hope not to directly use because dune provides an abstraction over them. So the answers are useful but only if you squint at them.

To perhaps overgeneralize the problem, the biggest impediment to this newcomer adopting ocaml has been that the tools and libraries seem to be built in the functional tradition to an extreme - lots of small things that you can compose beautifully into something bigger, if you can glean how they click together. Ocaml libraries’ mli files can be illuminating, but sometimes not quite enough to start using them. If TyXml didn’t have examples and a tutorial page, I would not have been able to infer its usage from its types. I still only approximately understand its compiler errors (admittedly I’ve only been tinkering with it for a couple evenings so far).

That also makes shopping for a library hard. Without a front page of example code giving a gist of how it all goes together, it can take a fair bit of work to figure out which one you want to commit time to learning and using.

Dream’s home page is such a breath of fresh air. Immediately clear and usable. And in terms of information, it isn’t much more than an mli file and some sample code, just presented beautifully and in a digestible order that tells a clear story. It’s downright inviting. Imagine if all ocaml libraries’ home pages looked like that?

This turned into much more text than I expected. Thanks for reading.

jbeckford · April 26, 2023, 8:34am

I’ve been hacking for a while to see if this could be solved. If you expand or click on Instructions: Use this module in your project in https://diskuv.com/DkSDKBook_Std/zzref/lwt/Lwt_timeout/, you’ll see four (4) steps for opam and dune. The steps start with opam install lwt.5.6.1 and end with adding lwt.unix to your (libraries). Is there a relationship between the package lwt and the library lwt.unix and the module Lwt_timeout? I have a hard time figuring it out! And I could have picked worse examples.

Please skip over the DkSDK part of the page for this discussion. What I’d like to see on every module documentation page is the instructions for how to use that module. (It is not enough to put the instructions on the package page, because of the 1 package : L libraries : M modules relationship.)
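The 1 package : L libraries : M modules relationship can be sketched as a small data model. This is only an illustration: the names lwt, lwt.unix and Lwt_timeout come from the post above, while the record types, the deliberately incomplete listing, and the how_to_use helper are hypothetical.

```ocaml
(* A toy model of the 1 package : L libraries : M modules relationship.
   The record types and [how_to_use] are hypothetical helpers. *)
type library = { public_name : string; modules : string list }
type package = { opam_name : string; libraries : library list }

(* Deliberately incomplete listing, for illustration only. *)
let lwt =
  { opam_name = "lwt";
    libraries =
      [ { public_name = "lwt"; modules = [ "Lwt" ] };
        { public_name = "lwt.unix"; modules = [ "Lwt_unix"; "Lwt_timeout" ] } ] }

(* Find which (package, library) pair provides a given top-level module. *)
let how_to_use (pkgs : package list) (m : string) : (string * string) option =
  List.find_map
    (fun p ->
      List.find_map
        (fun l ->
          if List.mem m l.modules then Some (p.opam_name, l.public_name)
          else None)
        p.libraries)
    pkgs

let () =
  match how_to_use [ lwt ] "Lwt_timeout" with
  | Some (pkg, lib) ->
    Printf.printf "opam install %s; then add %s to (libraries ...)\n" pkg lib
  | None -> print_endline "no known library provides this module"
```

A per-module documentation page would in effect be answering this lookup for the reader, which is why package-level instructions alone are not enough.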

If people think something like what I have on my doc pages will mostly solve the Unbound module problem for beginners, perhaps we can lobby that the feature goes on v3.ocaml.org (or perhaps odoc first).

Leonidas · April 26, 2023, 12:38pm

As the PR author… yes it does, but the issues discussed in this thread aren’t really issues with the OPAM API but the CLI. Except for some wonkiness between dune/findlib package names and OPAM package names one could design a straightforward UI that will do (most of) the things that people expect from package management in 2023, but the OPAM CLI is just bad at this as it comes from a bit of a different philosophy (that’s why all the invocations that make it work nicely e.g. local switches, pinning, dependencies in sync with OPAM files, etc are rather hidden behind complex sets of options).

That said, the linked PR is only the very beginning of this and it landing is just step one of many pieces necessary to make the situation better, with efforts of both Dune and OPAM maintainers. But I do hope that the effort of dune driving OPAM will lead to a less confusing UX that should make dependency management more straightforward.

@jbeckford: You could also tell users to add the lwt dependency to their .opam file and run opam install . --deps-only --with-test or something, basically to sync their switch with what the opam file declares. That’s one less place to repeat the Lwt instructions, at the cost of a semi-cryptic opam incantation.

jake · April 27, 2023, 1:46am

Those instructions look very useful for setting things up. The problem with step-by-step instructions, though, is that if one goes wrong, the user is a bit stuck to debug it.

What would be more useful for me, in figuring out why Timere.Timedesc was an unbound module, is a low-effort, accessible blueprint for how the pieces click together for any opam+dune project. I know dune has a manual, so RTFM, but for me it’s a pretty challenging FM. Too many things to hold in my head at once to figure out how things click together. Lots of well defined pieces in an algebra that composes beautifully if you can connect the dots.

dune’s documentation (General Concepts — Dune documentation) says:

“For libraries that are part of the installed world, or for libraries that are part of the current workspace but in another scope, you need to use the public name.”

So to debug my unbound module Timere.Timedesc, if I’m pretty sure it’s not something to do with my opam switch (do I need to run eval $(opam env)? Is emacs somehow not picking up that environment?), I need to determine what public names timere exposes.

In Timere’s dune-project file, it defines a set of packages, one of which is called timedesc. It does not define one called timere.

Its name in the dune-project is timere, but according to the manual, the name “Sets the name of the project. It’s used by dune subst and error messages.” I didn’t realize this the first time around, but that’s all it’s used for, i.e. it’s not related to my project’s dune files.

github.com

daypack-dev/timere/blob/main/dune-project#L14

    (authors "Daypack developers")
    (maintainers "Darren Ldl <darrenldldev@gmail.com>")
    (source (github daypack-dev/timere))
    (homepage "https://github.com/daypack-dev/timere")

    (generate_opam_files true)

    (name timere)

    (package
     (name timedesc)
     (synopsis "OCaml date time handling library")
     (description "
    Features:

    - Timestamp and date time handling with platform independent time zone support

      - Subset of the IANA time zone database is built into this library

    - Supports Gregorian calendar date, ISO week date, and ISO ordinal date

I know that a package corresponds to a subdirectory, so here, in the dune file for the timedesc directory in that repo, I find its public name.

github.com

daypack-dev/timere/blob/main/timedesc/dune#L10

    (rule
     (targets time_zone_constants.ml)
     (deps    ../gen-artifacts/time_zone_constants.ml)
     (action  (copy %{deps} %{targets}))
    )

    (library
     (flags     (-w "+a-4-9-29-37-40-42-44-48-50-70@8"))
     (name timedesc)
     (public_name timedesc)
     (instrumentation (backend bisect_ppx))
     (libraries
                timedesc-tzdb
                timedesc-tzlocal
                unix
                ptime
                angstrom
                seq
                bigarray
     )

Aha! I’ve connected two dots in the blueprint (I actually did connect those two dots tonight while writing this). That public_name in the timedesc package’s dune file goes in the libraries stanza in my library’s dune file to use it. And it is then available in code as a top-level module, and merlin recognizes it right away.
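A minimal sketch of what those two dots amount to on the consuming side. The dune stanza in the comment assumes a hypothetical executable called main, and the two modules below are stand-ins with made-up contents rather than the real libraries; they only model the fact that each dune library surfaces as its own top-level module.

```ocaml
(* Consumer's dune file (assumed layout), using the public_name found above:

     (executable
      (name main)
      (libraries timedesc))

   Stand-in modules with made-up contents, NOT the real libraries. *)
module Timere = struct
  let provided_by = "library timere"
end

module Timedesc = struct
  let provided_by = "library timedesc"
end

(* Resolves: Timedesc is a sibling of Timere, not a child of it. *)
let () = print_endline Timedesc.provided_by

(* Does not type-check, because Timere has no submodule named Timedesc:
   let () = print_endline Timere.Timedesc.provided_by *)
```

Being shipped from the same repository does not nest one module inside the other, which is why the Timere.Timedesc path fails while plain Timedesc works.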

So it’s all learnable and inferrable. It just takes patience and commitment. You gotta really want to use ocaml.

And of course, I see now that in the midst of my rabbit hole I didn’t realize that Timere’s readme on github says to add timedesc to your dune libraries stanza. From that I would have understood that I could then refer to Timedesc as a top-level module, not Timere. Not sure why I missed that; either impatience or cognitive load, or I’m just a fool.

One dot not yet connected in this path - is the package stanza’s name always the name of a subdirectory of the one containing dune-project? And if so, what is the relationship between that stanza’s name and the name and public_name in the directory’s dune file? Does the package name have to match one of them? Which one?

dangdennis · April 27, 2023, 4:14am

If you’re asking about name and public_name, check this page out.
https://dune.readthedocs.io/en/stable/dune-files.html#library

name in a dune file:

<library-name> is the real name of the library. It determines the names of the archive files generated for the library as well as the module name under which the library will be available, unless (wrapped false) is used (see below). It must be a valid OCaml module name, but it doesn’t need to start with an uppercase letter.

public_name in dune file:

(public_name <name>) - the name under which the library can be referred as a dependency when it’s not part of the current workspace, i.e., when it’s installed. Without a (public_name...) field, the library won’t be installed by Dune. The public name must start with the package name it’s part of and optionally followed by a dot, then anything else you want. The package name must also be one of the packages that Dune knows about, as determined by the .opam Files

So it’s the public_name of your lib that should match the one in the dune-project.

I always refer back to Stanza Reference — Dune documentation whenever I have to understand some dune file. The information around name and public_name should probably surface earlier. Maybe add the info somewhere into OCaml By Example | <fun>, and then surface this ocaml-by-example page more readily.

adl · May 5, 2023, 12:32pm

Me and my team have been trying out OCaml for a few months. As much as we like OCaml, the big blocker for us is the poor Windows support. We need our application to run on both Linux and Windows (without WSL2 or Docker). At least to me, it doesn’t seem trivial to build our application for Windows today. I’ve seen that the next version of opam will support Windows as a Tier 1 platform but it’s hard to know when this version will be released. The website says it will be released “in the coming months”.

nojb · May 5, 2023, 1:50pm

As a data point, at LexiFi we develop in and deploy our applications on Windows (using the MSVC compiler), and in our experience OCaml itself, as well as Dune, have excellent Windows support.

What doesn’t work as well is anything that has to do with OPAM, but we don’t care much for it: we develop in a monorepo and if we need a third-party library (in general we try to keep our dependencies pretty minimalistic), we simply vendor it in our tree.

As development environment we use Cygwin when we target Windows (Cygwin is also needed to compile OCaml itself on Windows), and WSL when we target Linux. We have been using this setup for years and it is pretty robust. As far as tooling is concerned, some of my colleagues use Merlin and OCaml-LSP for developing and these also work fine on Windows.

Cheers,
Nicolas

jbeckford · May 5, 2023, 3:44pm

I’d like to know the top one or two problems from your “poor Windows support” category. It may help with prioritization.

rikusilvola · May 5, 2023, 6:35pm

Hello!

While I won’t make any promises, we are confident to have an alpha release very soon.

Having opam support Windows has been a long-time dream of many. The team shares that dream and is actively working to make it come true. The road to this alpha has been long and we appreciate everyone’s patience. We have come a long way, but there is still more work to be done before we can announce anything.

ivan-kleshnin · August 16, 2023, 8:05am

There are many quirks in OCaml but this one is really detrimental.

Types and implementations live in separate files. - https://github.com/ocaml/ocaml/blob/trunk/stdlib/array.ml vs https://github.com/ocaml/ocaml/blob/trunk/stdlib/array.mli .

OCaml devs argue “it helps them to quickly skim over the API surface” but that’s something an IDE/editor could help with, by providing an “API view” mode or something. Like in Webstorm there’s a Structure Tab, nothing fancy.

I don’t have an ideal memory, so I want to see types after I “skim over the docs”. OCaml sources look pretty much the same as sources in Python, Ruby, Erlang, and other dynamically typed PLs, where you have to constantly guess “what type is this, or that”. It’s even worse as comments are also kept in the .mli.

I don’t get this convention at all. Types are the best documentation and have to be colocated with function bodies. AFAIK no other statically typed language follows the same weird approach of separated types.

nojb · August 16, 2023, 8:15am

Apart from documentation aspects, interface files play a technical role in the implementation of separate compilation. Besides, if you included the interface in the same file as the implementation, where would you put it? At the top of the file? Then you would still need to jump back and forth to read the types…

Turning your argument around, this is something the IDE could help with. In fact, this already exists and is provided by the VS Code LSP integration.

This feature was not invented by OCaml, it was borrowed from Modula-2.

Cheers,
Nicolas

silene · August 16, 2023, 9:28am

To be more precise, while Modula-2 had the notion of separate definitions and implementations, you still had to write the full type of your functions/procedures inside the module implementations. So, even in Modula-2, function types were colocated with function bodies.

bluddy · August 16, 2023, 10:30am

It would be nice if you could choose for the LSP integration to update the corresponding mli file. I think this is the pain point - having to update 2 places every time you change something. That’s why I tend to avoid mli files until I have to use them. I also don’t like not having comments when browsing the .ml file. One could imagine a ppx extension that automatically recreates the mli files based on whichever functions and types you choose to make [%public ].

C and C++ require separate header files.

Think of this as a feature. OCaml can completely hide what the module does – all you need to use is the mli file. Of course, this is assuming the documentation is good enough, which it often isn’t.
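As a rough illustration of that hiding, here is a single-file sketch. In a real project the signature would live in counter.mli and the body in counter.ml; the Counter module and all of its contents are hypothetical.

```ocaml
(* Signature and implementation inlined into one file for the sketch. *)
module Counter : sig
  type t                      (* abstract: clients cannot see the representation *)
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end = struct
  type t = int
  let zero = 0
  let step = 1                (* private helper: absent from the signature *)
  let incr n = n + step
  let to_int n = n
end

let () =
  let c = Counter.(incr (incr zero)) in
  Printf.printf "count = %d\n" (Counter.to_int c)
  (* [Counter.step] and [c + 1] would both be rejected by the type checker. *)
```

Client code can only go through the four names listed in the signature, so the representation could change from int to something else without breaking it.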

silene · August 16, 2023, 3:27pm

And yet, despite having separate header files, function types are colocated with function bodies in C and C++.

Note also that, in modern C++ (i.e., C++20), header files have been obsoleted by so-called module files, which pack declarations and definitions into a single place. In other words, you no longer have any .hpp files that you #include, only plain .cpp files. (Header files are still available if you need fancy macro preprocessing, e.g., the configure.h.in machinery from Autotools. And obviously, for backward compatibility, header files will still be supported for a very long time.)

That said, there are two use cases in C++ where you can physically split a module declaration from its implementation:

- You want to obfuscate the implementation and distribute it only in binary format, in which case a separate module declaration needs to be distributed in text format.
- The module implementation is so large that you need to split it over several .cpp files, in which case one of the .cpp files has to export (and thus duplicate) the declarations of all the other .cpp files (assuming these other files actually contain public definitions).

Note that the notion of C++ namespaces is completely orthogonal from the notion of C++ modules, which means that, even if a namespace is huge and needs to be split over several .cpp files (e.g., the std namespace), as long as its components are small enough to fit into separate files, declarations do not need to be duplicated and providing module implementations is sufficient.

glen · August 16, 2023, 7:38pm

Decoupling interface from implementation is both a technical necessity and a good habit (abstraction, documentation). However @ivan-kleshnin’s complaint still holds: in OCaml it is very common to find code with one-letter variables (at times meaningless), not a single comment and not a single type. That’s the style promoted by most tutorials, and by the language itself. Personally I find it absolutely awful: code is not elegant because it is (syntactically) shorter. Type inference is cool, but just because rocket science allows us to omit every type annotation (except when it doesn’t) doesn’t mean we should. Explicit types document the code. Explicit types clarify intention and make error messages much more meaningful. I regard annotating at least the named functions as a good habit. And writing comments, of course. To me, clear and self-explaining code is 10 times as valuable as any one-liner put together to impress other people with the author’s skills and/or expressiveness of the language.
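A small sketch of the contrast being described, with hypothetical function names. Both definitions compile to the same thing; the only difference is how much the reader has to infer.

```ocaml
(* Unannotated: the inferred type is float list -> float, but the reader
   has to work that out from the operators. *)
let mean xs = List.fold_left ( +. ) 0.0 xs /. float_of_int (List.length xs)

(* Annotated: argument and result types are stated at the definition. *)
let mean_annotated (samples : float list) : float =
  let total = List.fold_left ( +. ) 0.0 samples in
  total /. float_of_int (List.length samples)

let () =
  Printf.printf "%.2f %.2f\n"
    (mean [ 1.0; 2.0; 3.0 ])
    (mean_annotated [ 1.0; 2.0; 3.0 ])
```

With the annotation in place, a mistake inside the body is checked against the declared float list -> float at the definition, instead of producing a surprising inferred type that only fails at some distant call site.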

jfeser · August 16, 2023, 8:30pm

I agree this is a good place for tooling to help out. A few thoughts on ways that the lsp server in particular could make working with mli files easier:

- Code actions for adding/removing entries from the mli when working in the ml file.
- Inlay hints that show the mli type and/or documentation in the ml file.

I don’t think either of these features would be too hard to implement, so maybe someone will be inspired to give them a try.

ivan-kleshnin · August 21, 2023, 7:06pm

“Turning your argument around, this is something the IDE could help with. In fact, this already exists and is provided by the VS Code LSP integration.”

That’s true. The difference is that:

1. “explore API” is something you do once in a while (if ever – won’t replace a good documentation).
2. “look at types” is something you do constantly.

The lack of 1. is tolerable and this mode is fairly easy to implement in any editor. The lack of 2. is a game stopper and this mode is much more challenging. Every bug or lag basically ruins your flow and productivity.

I can imagine the whole thing is eventually resolved with a very smart IDE, which inlines types and intercepts changes to pipe them to .mli like @jfeser mentions… doing all that in a seamless and performant fashion. I can imagine that. But then another obstacle comes out:

Code reviews with plain diffs. How am I expected to review pull requests on GitHub / GitLab, which are not that “smart” IDE and likely never will be? So we’re back to the initial point.

bluddy · August 22, 2023, 6:12am

I think you need to get into the mindset of “OCaml does things differently here”, and “different is not necessarily bad”. OCaml follows the ML tradition, in which function type annotation is considered unnecessary. In theory, you’re not supposed to worry about type annotation much. This is very different from Haskell and Rust’s philosophies, where every function needs to be annotated – often because the type system is too complex to infer everything by itself.

The reason we even have types listed in .mli files (or module types in general) is because modules need to interface with each other in separate compilation, and for that they require types. It also makes sense that your external interface towards the world is where you care about making the types nice and clean – not within the module, where types can be fluid and shift around, and it’s fine so long as the module ends up compiling.

Again, different, but different doesn’t necessarily mean bad.

dbuenzli · August 22, 2023, 8:03am

I agree with your comment “different is not necessarily bad” and I will be the first one to defend separate .mli files but not with bogus arguments :–)

You can perfectly well have separate compilation and interfaces while having everything in the same file. For example by having private and public annotations in your language. Formally separate compilation does not happen through .mli files but through cmi files which could correspond to an .ml’s public annotations.

That being said .mli files are extremely nice for what one ends up (or should end up) doing most of the time which is: API design. Being able to work solely on the signatures and the documentation of your API without being bothered by the private definitions and the concrete implementations is extremely nice in practice.
