Configuring a project directory and setting up the related toolchain (opam, dune, etc.)

My focus is on using OCaml tools for coding programs.
I’m well past my first “Hello World!” in OCaml, but I think I’m still quite far from being able to create an OCaml program for the real world.

Learning the OCaml language (stdlib), setting up a working OCaml editor/IDE, and evaluating which stdlib replacement to use has been a long and painful story (and still is, especially regarding the stdlib replacement, which strongly conditions the way of writing an OCaml program).

I’ve installed Opam, Findlib and Dune.
I’ve now read most of their documentation and experimented with them from the CLI on sample programs/projects. I’ve also read various articles (too often out of date or containing errors, which makes learning difficult).

Where is it specified how an OCaml project must be structured?
I mean how to structure the source code tree within bin/, src/, and lib/ directories, along with the configuration files (mainly Findlib META files, the opam or .opam Opam file, and the many dune files).
Are there some templates generated by the tools?
How far is it automated and which information should be added manually?
This is far from being obvious, especially because of the fast moving transformation we encounter.

For example, if I want to program in OCaml, I need a compiler and all the individual compiled modules, whether they come from a library, a package, or some of my own modules possibly gathered in a library. I must use tools for that, mainly Findlib and Opam. But this will not keep me out of trouble.
To avoid “known problems” (discussed regarding how Opam, Dune, and Findlib are naming and using sets of “resources”), @Leonidas recommends creating a local switch in the project and using an opam file that will trigger the installation of packages:

As a stopgap solution I recommend using clean switches per project and only install packages via dependencies declared in the opam file and avoiding to call opam install … directly

See https://discuss.ocaml.org/t/connection-between-libraries-in-opam-dune-and-findlib/2536/6
This is possible with Opam 2 ($ opam switch create foo <package-or-version> then $ opam pin edit foo to edit it).
Thanks for that. But Hu!! Imagine the pain for a complete beginner stuck in the middle with a recommended OCaml setup…

.merlin files?
We are asked in the documentation to write .merlin files so Merlin can do its job with our libraries. But in fact, Dune generates .merlin files. Great. Now I know!

A build system for OCaml programs?
Dune seems to be a promising one.
But do people write manually all their dune files distributed in their source trees? This seems very error prone.
Is there a Dune function such as the Opam $ opam pin ... for editing all the dune files?

Is it possible to define in one place all the dependencies for a program (or project) with real-time checking (especially regarding modules/libraries/packages compatibilities)? Once it’s done, we can focus on writing .ml and .mli files and just test it.

And we also have/need the files README.md, LICENSE.md, CHANGES.md, etc.
Is there a template and checking tool for that?
And of course, we may have configuration files for Version control (Git) and CI tools (Travis).
That’s a bunch of files.

I’ve spent about 3 hours working with ocaml today because I thought F# and reasonml were so cool and I am stunned that I can’t find any resources for this as well

I’ll try to answer what I understand:

There is none at the moment, because the tooling doesn’t really require any specific structure and can work with any structure you like. There are some conventions (e.g. bin for executable entry points, and src/lib for the rest of the source) but they are not strictly followed by everyone. I think an authoritative convention could simplify things but we would need some official push from the OCaml team for that.

I think so, yes. I write them manually. Curious, could you explain why you think it is error prone? Dune files can also be as short as needed if the directory doesn’t contain anything “weird” such as custom aliases, targets, and whatnots.

I don’t think it’s possible now, but this could be good feedback for the Dune maintainers if they are not aware of it already. The way to add a dependency to your project is 1) add the package to your opam file, 2) run opam install --deps-only (if you haven’t pinned) or opam update && opam upgrade (if you have pinned), and 3) add the package to your dune file.

I don’t think there’s a template and checking tool, but they are quite common files for open source projects, so they’re not really specific for OCaml. If you’re using topkg for managing your distribution (you don’t have to, but it’s a nice tool), it has several commands such as topkg lint that can ensure that some conventionally required directories and files are present, but even those are only very simple checks.


I’ve been using this https://github.com/jordwalke/pesy to generate projects. It uses a tool called https://github.com/esy/esy that creates a sandboxed environment with opam, dune, and other tools wired up.


I maintain a minimal hello world OCaml project that I clone when I want to start something new. It’s probably not the most idiomatic so please raise pull requests to improve it. If the community develops a strong opinion about the structure of projects, it could embrace something similar as a recommendation. It seems to me that this is easier than developing tools that create such a structure, albeit a bit less flexible.


How did you come to OCaml? (from which programming language, etc.?)

This issue seems trivial, as the discussion shows (“we write this by hand”, “pls. see an example project tree”).
And at the same time it’s crucial, because if you don’t fully understand the package and library installation system (Opam) and the build system (Dune), sooner or later you’ll be stuck in the middle.

Thanks for your feedback.

A specific structure would be helpful for beginners.
Anyway, once someone has enough knowledge to compile and link manually with ocamlc/ocamlopt (and ocamlfind for its ease of use), they can simply use Opam for library and package management. I think it’s a good thing to first know how to do the job manually before using tools.

First, things are clearer today thanks to the different discussions about code organization and tools and stdlib replacement. There are still questions regarding tools that are discussed in parallel (Opam with a local switch).

Concerning the Opam AND Dune files for the same program/project, things are clear.
I’ve put some details below. Can you pls. check especially if I’m right concerning what I call the “external” packages-libraries and the “internal” library gathering the modules written to support the main.ml file?

Details

We specify in a foo.opam file what is our program and all its dependencies (by hand, from scratch or from a template). This file is used by Opam for installing all the libraries that our program needs.

$ cat project2.opam
opam-version
name
version
synopsis
...
depends: [
   "p1" 
   "p2" 
   "p3" 
   "p4"
]
build, install
dev-repo
url
descr
# values for fields other than _depends_ are intentionally left blank

For writing valid Dune files, this is now clear. Dune needs at least two simple dune files. (AFAIK, we can package our modules in more than one library; they will be available as soon as they are correctly declared in the dune file of the main module. Maybe there are Dune features for handling nested libraries. We’ll see that later…)

$ cat src/dune  # or bin/dune or blah/dune...
(executable
  (name main)
  (libraries p1 p2 p3 p4 foo))
; "external" libraries from other people (p1 p2 p3 p4) + "internal" library for this program
; foo or lib or blah as desired naming for the "internal" library gathering the modules of our program

and

$ cat lib/dune
(library
  (name foo))
; foo or lib or blah as desired
; The foo library holds all my internal modules on which my main module depends

So we must specify the program name and its “external” dependencies two times in two different places as required by the tools:
1/ for Opam: in an opam file (/project2/opam or /project2/project2.opam), with a 1 package-1 library mapping enforced by Opam to avoid problems.
2/ for Dune: in a pair of dune files

Is there a utility program for populating the Opam file from the two dune files? (as they focus on the executable)

I agree that once someone knows the details you explained, they just have to fine-tune their three configuration files (for the package and build systems). It’s simpler than the complex configurations I have encountered elsewhere.
In fact this is true when you have time and are in a quiet environment without many programs/projects.
But when you are in a hurry and working on several projects/programs in parallel, or distracted, you are permanently and needlessly exposed to small configuration errors that will prevent you from building correctly and make you lose time and patience. Think also about the errors in your code and the errors from some tools that can all appear together…
That’s why I feel that there should be ONE specification of the program with all its dependencies, and a generation of ONE hierarchy with the Opam and Dune configuration files. And the Opam package management and Dune build systems should adapt to this. When this ONE specification is changed, the stuff should be updated with a clear information to the developer.
This is my feeling based on my quite small experience with OCaml tools. But the amount of time I’ve already spent to understand that (with your help) is a good indicator of the maturity of the documentation and of these tools.
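To make the “ONE specification” idea a bit more concrete, here is a minimal sketch in plain OCaml (stdlib only). It is not an existing tool: the `spec` record, its field names, and the two rendering functions are all hypothetical, and it assumes that each opam package exposes a library of the same name, which is not always true.

```ocaml
(* Hypothetical single description of a project. Not an existing tool. *)
type spec = {
  name : string;                (* package name, e.g. "project2" *)
  internal_lib : string;        (* library gathering our own modules *)
  external_deps : string list;  (* opam packages; assumed to expose a
                                   library with the same name *)
}

(* Render the name and depends fields of the opam file. *)
let opam_fields spec =
  let quoted =
    List.map (fun d -> Printf.sprintf "  \"%s\"" d) spec.external_deps
  in
  Printf.sprintf "name: \"%s\"\ndepends: [\n%s\n]" spec.name
    (String.concat "\n" quoted)

(* Render the executable stanza of the dune file of the main module. *)
let dune_executable spec =
  Printf.sprintf "(executable\n  (name main)\n  (libraries %s))"
    (String.concat " " (spec.external_deps @ [ spec.internal_lib ]))

let example =
  { name = "project2";
    internal_lib = "foo";
    external_deps = [ "p1"; "p2"; "p3"; "p4" ] }

let () =
  print_endline (opam_fields example);
  print_endline (dune_executable example)
```

The point is only that the list of external dependencies is written once and both configuration fragments are derived from it. A real tool would also have to deal with version bounds and with packages that ship several libraries.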

I’ll have a look at topkg. Thanks.

That looks nice. Thanks.
However there is disclaimer:

Tradeoffs
esy-peasy is good for rapidly making new small executables/libraries. Once they grow, you’ll want to “eject out” of esy-peasy and begin customizing using a more advanced build system.

For working on reliable industrial applications that are becoming more and more complex, I feel that Dune is probably more promising.
Let’s see what the Dune team will answer here about that!

This setup (pesy + esy) uses dune.

esy is an alternative workflow which supports installing and building opam packages and which has local switches by default with a global immutable build cache (a-la Nix).

pesy is a configurator for dune builds.


I usually do a lot of scala in my free time. I’ve been messing around with a lot of other languages including haskell, rust and crystal this weekend. I did a small F# demo a few months ago and know it’s based off of ocaml and wanted to try that out.

I cannot speak for the whole dune team, but I use the following layout for my projects:

 ./
 |
 +-- LICENSE.md
 +-- CHANGES.md
 +-- README.md
 +-- lib/
 |   +-- dune
 |   +-- a.ml
 |   +-- a.mli
 |   +-- b.ml
 |   +-- b.mli
 +-- bin/
 |   +-- dune
 |   +-- some_program.ml
 +-- test/
 |   +-- dune
 |   +-- test_all.ml
 |   +-- test_a.ml
 |   +-- test_a.mli
 |   +-- test_b.ml
 |   +-- test_b.mli
 +-- dune-project
 +-- myproject.opam
 +-- .travis.yml

The three root markdown files are understood by odig. It makes it possible to globally query the changelog, license or readme of installed packages. For CHANGES.md, I recommend following https://keepachangelog.com/. There’s no precise format, but it’s a set of guidelines that are worth following.

lib/dune declares a (library) with the external dependencies like lwt, containers or whichever libraries you are using.

bin/dune declares an (executable) such as a client binary for your library. Not all projects will have this. It usually depends only on your library, and maybe on things like cmdliner.
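As a rough sketch of what these two files could look like for the layout above: the library name `myproject` is my assumption (taken from `myproject.opam`), and `lwt`, `containers` and `cmdliner` are only the example dependencies mentioned in the text. Both files are shown in one block for brevity.

```dune
; lib/dune (sketch): the library and its external dependencies
(library
 (name myproject)
 (public_name myproject)
 (libraries lwt containers))

; bin/dune (sketch): an executable that depends on the library above
(executable
 (name some_program)
 (libraries myproject cmdliner))
```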

test/dune declares a (test). It depends on your library and your test dependencies, such as oUnit. For unit tests, I use one test module per library module, and export a single value val suite : OUnit2.test in each. Having explicit almost-empty mli files helps find test code that’s written but not attached to a test suite. If I have some integration tests, I put them in a separate directory, e.g. test/unit/ & test/integration/. The structure of integration tests is usually a bit more free-form.
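The “one suite per module” convention can be sketched as follows. This is a single-file approximation, assuming the OUnit2 library is installed: in a real project `Test_a` would live in test/test_a.ml with its signature in test/test_a.mli, and the last two definitions would be test/test_all.ml. The module `A` and its `double` function are hypothetical stand-ins for a library module.

```ocaml
open OUnit2

(* Hypothetical stand-in for the library module under test (lib/a.ml). *)
module A = struct
  let double x = 2 * x
end

(* The signature plays the role of the almost-empty test_a.mli: only
   [suite] is exported, so a test function that is never added to it
   becomes an unused value that the compiler can warn about. *)
module Test_a : sig
  val suite : OUnit2.test
end = struct
  let test_double _ctxt = assert_equal 4 (A.double 2)

  let suite = "A" >::: [ "double" >:: test_double ]
end

(* Role of test_all.ml: collect every module's suite and run them. *)
let all = "all" >::: [ Test_a.suite ]

let () = run_test_tt_main all
```

With a second library module, a `Test_b.suite` would simply be appended to the list in `all`.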

dune-project just sets some global options. I like to keep it vanilla and just put the language version here, but some people use it to globally set some configuration options, for example to globally disable some warnings using (env).

The opam file has info about webpage, maintainer, etc. It’s unfortunate but for now we have to copy the external library dependencies here. It’s here that you can specify version bounds on them, e.g. if you need features only introduced in lwt 4.0.0.
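For illustration, a depends field with version bounds might look like the fragment below. The package list and the bounds are only examples based on what is mentioned in this thread (lwt 4.0.0, a 4.05 minimum compiler, oUnit for tests), not a recommendation.

```opam
# myproject.opam (fragment, sketch)
depends: [
  "ocaml" {>= "4.05.0"}
  "dune" {build}
  "lwt" {>= "4.0.0"}
  "containers"
  "ounit" {with-test}
]
```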

For .travis.yml, I use the configuration from ocaml-ci-scripts. It is enough to test the latest release of each minor compiler release (e.g. test 4.04.2 and 4.05.0, but not 4.04.1). It used to be a viable strategy to test them all from 4.02 onwards, but this is a bit long these days. Starting at 4.05 is a good idea, as it covers the next Debian release. 4.01 and older are difficult to support as there are no extension points, so for example it is impossible to deprecate values, etc. If I have some C stubs I like to include a build on a different distro, like Alpine vs Debian. It might be useful to add a bytecode only build or a 32 bit build depending on what the project is doing, but generally speaking 4.05 to the latest version are good enough.

Some projects have several packages. For example it is common to have a pure version and a lwt part dealing with IO. In that case, I like adding a lib_lwt directory which defines a (library) depending on the pure part and lwt. This will correspond to a second opam file such as myproject-lwt.opam. You can do releases to opam using dune-release.
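A possible lib_lwt/dune for this split is sketched below. The names `myproject_lwt` and `myproject-lwt` are assumptions chosen to match the `myproject-lwt.opam` file mentioned above, and it assumes the pure library in lib/ is named `myproject`.

```dune
; lib_lwt/dune (sketch): the IO part, built on the pure library and lwt
(library
 (name myproject_lwt)
 (public_name myproject-lwt)
 (libraries myproject lwt))
```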

Having a tool to generate these structures would be useful, and I know that several people are working on such tools. While dune would be a nice place to add this, there are so many different ways to do it that the dune team would prefer to have an external tool deal with this for now, like pesy is doing, or like create-react-app for react.


It’s a matter of taste, but I really dislike the lib and bin style. I just put stuff in src (each binary/library under its own subdirectory in the case there are several of these). An example would be https://github.com/c-cube/qcheck (with the non standard examples/ instead of tests because in this particular case, tests are supposed to fail… hum).


I don’t think there is a strong technical reason to prefer one over the other since Dune is very flexible in finding the necessary parts. It is more about what supports maintenance better and this may differ across projects. But both layouts already agree on a lot of points:

  • keep sources in sub directories
  • keep library sources in their own directories (this is somewhat enforced by Dune)
  • keep C stub files in their own directories (enforced by Dune)
  • keep *.opam at the top level
  • keep administrative content at the top level

Yes, that seems to be the key part. In particular, in most cases you can move around your directories without having to edit any dune files, so the directory structure is completely cosmetic.
