What are some libraries you almost always use?

leostera · January 26, 2021, 8:31am

Hello fellow OCamlers,

I’m putting together some ideas for the standard library that will ship with Caramel, an OCaml-derived language for the Erlang VM that I’m building, and would love to hear which libraries you normally use.

I’ll start with just a handful:

dbuenzli stuff: BOS, Cmdliner, Fpath, and Logs are almost always in my projects
JaneStreet’s sexplib always finds its way there
ppx_deriving

Which ones do you go for?

PS: If you’re interested, here’s the issue where I’m collecting thoughts on what the standard library should look like. It is definitely a larger stdlib than OCaml ships with, and that’s sort of the point.

hcarty · January 26, 2021, 3:11pm

In addition to exactly the handful you mentioned @leostera, time and timing libraries like dbuenzli’s ptime and mtime are fairly common low-level libraries in a lot of the code I’ve written.

lindig · January 26, 2021, 3:53pm

I would consider establishing Containers as part of a standard library.

c-cube · January 26, 2021, 4:07pm

Would it make sense, instead of reusing OCaml libraries, to write
bindings to the existing Erlang/OTP libraries? I imagine that reusing
existing erlang or Elixir libraries would be convenient for writing the
kind of server the Beam is typically used for.

leostera · January 26, 2021, 4:18pm

@hcarty good point!

@lindig containers is a good inspiration, thanks!

@c-cube yup! The vast majority of the Stdlib will be bindings to existing Erlang/Elixir code - but the idea here is to learn from existing OCaml API design and the efforts that have gone into it.

As an example, @yawaramin recently pointed out that ocaml-decimal would be good to have. Erlang already has BigInt support, so it doesn’t really need zarith: we can borrow the API design and provide an implementation of big decimals that just runs on top of Erlang’s native ints.
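To make the idea concrete, here is a minimal sketch of a decimal represented as an integer coefficient and a base-10 exponent. The type and function names are hypothetical (this is not ocaml-decimal's actual API), and OCaml's fixed-width `int` only stands in for Erlang's arbitrary-precision integers, so this version can overflow where an Erlang-backed one would not.

```ocaml
(* Hypothetical sketch: a decimal is [coef * 10^exp].
   OCaml's fixed-width [int] stands in for Erlang's big integers. *)
type decimal = { coef : int; exp : int }

let rec pow10 n = if n <= 0 then 1 else 10 * pow10 (n - 1)

(* Rescale both operands to the smaller exponent, then add coefficients. *)
let add a b =
  let exp = min a.exp b.exp in
  let scale d = d.coef * pow10 (d.exp - exp) in
  { coef = scale a + scale b; exp }

let to_string d =
  if d.exp >= 0 then string_of_int (d.coef * pow10 d.exp)
  else begin
    let digits = -d.exp in
    let p = pow10 digits in
    let sign = if d.coef < 0 then "-" else "" in
    let n = abs d.coef in
    let frac = string_of_int (n mod p) in
    (* Left-pad the fractional part with zeros up to [digits] places. *)
    let pad = String.make (digits - String.length frac) '0' in
    sign ^ string_of_int (n / p) ^ "." ^ pad ^ frac
  end

let () =
  (* 1.5 + 0.25, computed exactly on integers with no floating point. *)
  let x = { coef = 15; exp = -1 } and y = { coef = 25; exp = -2 } in
  print_endline (to_string (add x y))
```

The point of the sketch is only that the arithmetic reduces to integer operations, which is why native big integers are enough as a backend.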

Hope this clarifies things a bit more!

Chet_Murthy · January 26, 2021, 7:18pm

“fmt”, “cppo”, “ocamlfind”, “pcre”.

There’s a (ahem) healthy and vigorous debate about whether it’s a good thing to put all these libraries into the core of a language. Different languages do it different ways. A certain amount of coordination is important (in any case) between the maintainers of the language, and some set of third-party-maintained libraries.

bobot · January 27, 2021, 9:28am

In addition to Cmdliner and Fmt, Base is always my first dependency in my projects.

shonfeder · January 27, 2021, 11:11am

Any interesting project needs tests, and qcheck is my favorite way to test. I think it’d be neat if a standard library encouraged property based testing (with perhaps a section on using constant generators for unit testing, as a degenerate case of PBT).
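As a rough illustration of that last point, here is a toy property-based testing core written against the stdlib only. Every name in it (`gen`, `int_range`, `list_of`, `const`, `check`) is hypothetical and this is not QCheck's API; it just shows how a unit test falls out as a property checked once on a constant generator.

```ocaml
(* A toy property-based testing core; hypothetical, not QCheck's API. *)
type 'a gen = Random.State.t -> 'a

let int_range lo hi : int gen =
 fun st -> lo + Random.State.int st (hi - lo + 1)

let list_of (g : 'a gen) : 'a list gen =
 fun st -> List.init (Random.State.int st 10) (fun _ -> g st)

(* A constant generator always yields the same value. *)
let const x : 'a gen = fun _ -> x

(* Run [prop] on [count] generated inputs and return the first
   counterexample, or [None] if every input satisfied the property. *)
let check ?(count = 100) (g : 'a gen) (prop : 'a -> bool) : 'a option =
  let st = Random.State.make [| 42 |] in
  let rec go n =
    if n = 0 then None
    else
      let x = g st in
      if prop x then go (n - 1) else Some x
  in
  go count

let () =
  (* Property: reversing a list twice gives back the original list. *)
  let ints = list_of (int_range (-50) 50) in
  assert (check ints (fun l -> List.rev (List.rev l) = l) = None);
  (* Unit test as the degenerate case: one constant input, one run. *)
  let sorted l = List.sort compare l = [ 1; 2; 3 ] in
  assert (check ~count:1 (const [ 3; 1; 2 ]) sorted = None)
```

A real library adds shrinking, better reporting and richer generators; the sketch leaves all of that out.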

Expect tests or cram style tests are a nice informal complement.

lindig · January 27, 2021, 11:25am

Since moving to dune, I have no use for ocamlfind; what makes it indispensable for you? pcre binds C libraries - I’d prefer re because it is a pure OCaml package and hence likely to be more portable.

yawaramin · January 27, 2021, 2:13pm

Qcheck is remarkably easy to port, btw. I ported qcheck-core to BuckleScript once with very minimal changes.

rgrinberg · January 27, 2021, 7:24pm

Menhir. There’s no parser generator that compares to it.
re. Regular expressions without a weird syntax and pathological performance edge cases.
Markus Mottl’s numerical libraries (lacaml, gsl). Highly stable and a great resource for understanding how to write C bindings.
Jane Street’s libraries deserve a special mention. They’re rarely usable in practice because of portability and dependency concerns, but they’re the best reference when one needs to figure out how to do something correctly and efficiently in OCaml.

Chet_Murthy · January 27, 2021, 10:27pm

Ah. So: I don’t use Dune, for three reasons:

1. I’ve already got a bunch of Makefile-based projects, and they work fine, really fine. I see no gain in dune.
2. Dune is “opaque”: I’ve seen too many questions “why is dune doing this?” for me to feel comfortable with it.
3. I build projects from time-to-time with significant C/C++ components: Dune doesn’t support these well – frankly, nothing will other than Make.

I wrote a little “wrapper” for ocamlfind called “not-ocamlfind” that provides a little extra function, and with that, building multi-directory projects with Make is actually really sweet. To wit, each directory’s makefile installs its final product into a “local-install” directory at the top of the project, but using a “reinstall-if-diff” operation. During “make depend” processing, it looks at the relevant packages in “local-install” and puts in dependencies on the META files in those packages.

The effect is that, when you have a nontrivial graph of directories and dependencies, you just run through them doing “make” (from your toplevel makefile), and only those directories that depend on other directories that got recompiled will need to be recompiled.

What I’m trying to say is: the “composition model” of findlib packages, is equally applicable to the internal organization of a large project into subdirectories and sub-packages. As an added plus, when writing tests you just assume the packages being tested are already installed, and findlib will find them. So writing/running tests -inside- a project, is the same as doing so -outside- the project.

But whatever: I get that people like dune. It’s all good.

mjambon · January 29, 2021, 12:41am

I use cmdliner for user-facing executables, but there’s no other library that I use systematically. The other tools I use systematically are not libraries (ocamlfind, opam, dune, …).

UnixJunkie · January 29, 2021, 3:03am

I use all those very often: batteries, dolog, minicli, parany.

dbuenzli · January 29, 2021, 9:38am

Since many people mentioned it, I’d just like to say that looking at cmdliner’s API makes me feel slightly nauseous these days.

A separate redesign would be a good idea but somehow that does not put food on the table and I still manage to cope with the current one.

Also, refined and better designs of the other libraries you mention can be found here.

yawaramin · January 29, 2021, 2:51pm

Cmdliner’s applicative style looks really nice with let operators though.
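To show what that looks like without pulling in a dependency, here is a sketch built on a hypothetical miniature "term" type. None of this is cmdliner's API (`term`, `opt`, `int_opt` and the key/value environment are made up for the example); the only thing borrowed is the shape: `const` plus an application operator `$`, with `let+` and `and+` defined on top of them.

```ocaml
(* Hypothetical miniature of an applicative "term": a function from
   already-split key/value options to a value. NOT cmdliner's API. *)
type 'a term = (string * string) list -> ('a, string) result

let const x : 'a term = fun _ -> Ok x

(* Applicative application, in the spirit of cmdliner's [$]. *)
let ( $ ) (f : ('a -> 'b) term) (x : 'a term) : 'b term =
 fun env ->
  match (f env, x env) with
  | Ok f, Ok x -> Ok (f x)
  | Error e, _ | _, Error e -> Error e

(* Let operators defined purely in terms of [const] and [$]. *)
let ( let+ ) x f = const f $ x
let ( and+ ) a b = const (fun a b -> (a, b)) $ a $ b

let opt name ~default : string term =
 fun env -> Ok (Option.value (List.assoc_opt name env) ~default)

let int_opt name ~default : int term =
 fun env ->
  match List.assoc_opt name env with
  | None -> Ok default
  | Some s -> (
      match int_of_string_opt s with
      | Some n -> Ok n
      | None -> Error (name ^ ": expected an integer"))

type config = { host : string; port : int }

(* Plain applicative style ... *)
let config_plain =
  const (fun host port -> { host; port })
  $ opt "host" ~default:"localhost"
  $ int_opt "port" ~default:8080

(* ... and the same term written with let operators. *)
let config =
  let+ host = opt "host" ~default:"localhost"
  and+ port = int_opt "port" ~default:8080 in
  { host; port }

let () =
  assert (config [ ("port", "9000") ] = Ok { host = "localhost"; port = 9000 });
  assert (config_plain [] = config [])
```

With the let operators the argument names sit next to the terms that produce them, instead of being matched up by position as in the `$` chain. Binding operators need OCaml 4.08 or later.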

mjambon · January 29, 2021, 10:26pm

Some feedback on cmdliner from a late adopter (I know it’s off-topic, but we’ll be ok):

I don’t understand the cmdliner type machinery, but I get by by using past projects as templates and sources of examples aka copy-paste. Also, I avoid using & completely (I forgot what it does except that it’s trivial).

Once I have a template for a CLI implementation, most of the time is spent figuring out how to express the type/format of a command-line argument. Maybe a cheatsheet would be beneficial.

The feature-completeness and the output of cmdliner are amazing.

mjambon · January 31, 2021, 10:51am

I forgot to mention alcotest, for unit tests. I’m a very happy user.

dbuenzli · February 15, 2021, 11:14am

Since @leostera gave his interpretation of this here, I’ll make that statement more precise, as I find myself quite in disagreement with what he wrote under what I find to be the wrong lenses (or rules) to assess the situation.

I won’t comment on the docs aspect since I don’t find cmdliner’s docs to be particularly good – nor particularly bad either but I would certainly write them differently now that the feature set of the library has changed (more on this below).

However I’d like to address two impressions that come out of that article before they start to become myths. These are:

1. The API is hard to use.
2. You need to understand what an applicative is to be able to use the library and that’s a high bar for usage.

Regarding 1. I don’t think the API is hard to use. As far as I’m concerned a hard to use API would be an API that allows you to easily shoot yourself in the foot, that makes it hard to define, tweak or evolve your command lines, or makes it hard to understand what is going on when you come back to the code or when you have a bug. I personally find none of that to be true.

Regarding 2. You absolutely do not need to understand what an applicative is to be able to use the library. Over time I even evicted that fact from the documentation and renamed Term.pure to Term.const. I think that any working OCaml programmer should be able to start from the short basics and gradually tweak that example to get to what s.he needs without ever having to understand what an applicative is by simply following the type mechanics.

So if it’s not 1., what is it? One of the reasons why a lot of cmdliner code gets cut and pasted to be modified is that the API induces a lot of boilerplate and that it became slightly messier over the years. Here are a few reasons why this is the case:

1. It started simply as a cli parsing library. However, over time it gradually evolved towards an “os process” interface library. Environment variable lookup was integrated as well as a formalization of program exits and their documentation. This was bolted on top of the API without breaking it. This means that the current API is unlikely to be the best way of structuring and exposing the feature set. That of course hampers its usability.
2. It failed to capture one important pattern that became widespread as command line tools grew in complexity over the last decade, which is to have specific command line syntaxes for tool object verb, not just tool cmd. Many people, myself included, ended up manually encoding this pattern in an unsatisfactory manner too many times. (This PR is meant to fix that, but that may not help with streamlining the API).
3. The Arg.t to Term.t mechanics could likely be streamlined by using fewer applications (in fact the new design I have in my head eschews them entirely).
4. A few defaults that became clearer as more and more cmdliner programs have been written could likely be changed and/or integrated to cut down on some of the boilerplate.
5. The library was written 10 years ago and OCaml and its stdlib were different. Newcomers often evaluate program sources and designs with respect to the current state of the art rather than in the light of the era in which they were created; that leads to misunderstandings. For example there was no result type in the stdlib. Its integration both at the Arg.conv and Term level was added later, perhaps hastily, which muddled the API; nowadays it would certainly play a central role in the design, both for parsing and managing exits. Another example is the & operator, which confuses a lot of people but would nowadays simply not exist since @@ does, or not be needed at all (see 3.).
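On the last point, a small stdlib-only sketch may help: cmdliner documents `f & v` as plain application `f v`, right associative, which is what the stdlib's `@@` does today. The `( & )` below is a local definition mirroring that description, and `required`, `pos` and `info` are hypothetical string-building stand-ins, not the library's functions.

```ocaml
(* Local definition mirroring the documented behaviour of cmdliner's
   [&]: [f & v] is just [f v]. Only the stdlib is needed here. *)
let ( & ) f x = f x

(* Hypothetical stand-ins for a chain of argument-building functions;
   they only build strings so the nesting is visible. *)
let required s = "required (" ^ s ^ ")"
let pos n s = Printf.sprintf "pos %d (%s)" n s
let info name = "info " ^ name

(* Both operators are right associative, so both chains nest as
   [required (pos 0 (info "FILE"))]. *)
let with_amp = required & pos 0 & info "FILE"
let with_at = required @@ pos 0 @@ info "FILE"

let () =
  assert (with_amp = with_at);
  print_endline with_at
```

Read this way, `&` is only a way to drop parentheses in a chain of applications.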

Finally to come back to the use of applicative which in the particular case of cmdliner was discovered, not applied. I have seen many people over time claiming that this was non-obvious, needlessly bureaucratic for “just parsing an array of string” or that what they wanted is just “simple” and direct Arg-like mutations to get their parse result (these people should be forced to go work on the ocaml drivers…). These thoughts largely miss the point in my opinion. I think the current design is a good one because:

It scales. In a term you can encapsulate non-trivial cli interaction that eventually defines a whole immutable datastructure.
It is composable. Your libraries can expose cli interaction terms to be reused.
It precisely avoids the need for reference cells or mutations. These become tempting global mutable state or enable easy to do but hard to understand effectful contortions; so it’s better to avoid them.

These points encourage you to have a good program structure where you define data structures and algorithms, and, cleanly separated from them, the cli interface and its logic to expose them to the shell.
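For contrast, here is a sketch of the "Arg-like mutation" style mentioned above, using the stdlib `Arg` module. The option names and the `config` record are hypothetical. The refs are confined to a single function so the rest of the program only sees an immutable record; this is one possible way to keep the separation described above, not a claim about how cmdliner does it.

```ocaml
(* "Arg-like mutation" with the stdlib [Arg] module. The refs stay local
   to [parse]; callers only ever receive an immutable record. *)
type config = { verbose : bool; jobs : int; files : string list }

let parse (argv : string array) : (config, string) result =
  let verbose = ref false and jobs = ref 1 and files = ref [] in
  let spec =
    [ ("-v", Arg.Set verbose, "Be verbose");
      ("-j", Arg.Set_int jobs, "Number of jobs") ]
  in
  match
    (* A fresh [~current] keeps [parse] re-entrant instead of relying on
       the global [Arg.current] counter. *)
    Arg.parse_argv ~current:(ref 0) argv spec
      (fun f -> files := f :: !files)
      "usage: tool [-v] [-j N] FILE..."
  with
  | () -> Ok { verbose = !verbose; jobs = !jobs; files = List.rev !files }
  | exception Arg.Bad msg -> Error msg
  | exception Arg.Help msg -> Error msg

let () =
  match parse [| "tool"; "-v"; "-j"; "4"; "a.ml"; "b.ml" |] with
  | Ok c ->
      assert (c = { verbose = true; jobs = 4; files = [ "a.ml"; "b.ml" ] })
  | Error msg -> prerr_endline msg
```

Unlike a term, this `parse` function cannot be combined with another library's options without editing `spec` and the refs by hand, which is the composability point made above.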

leostera · February 16, 2021, 11:20am

@dbuenzli thanks for taking the time to answer!

I think we should encourage more public discussions like this since we all can learn a lot from each other. Especially because I think your contributions have time and again moved the OCaml platform forward.

I’ll say tho that the Rambling Machines mailing list is meant as a place for me to share less-polished ideas, with as little editing as possible, in contrast to the essays I’ve got on my website.

I even encourage replies like this one!

With that out of the way, on to your points!

before they start to become myths

I don’t think I’m starting any myths here. I’ve heard this from a lot of people coming from the Reason world, for well over 3 years now.

Regarding 1. I don’t think the API is hard to use

Hard and easy are completely subjective and rely on your body of knowledge and experience. If you asked me how hard it is to contribute to Caramel, I’d say it’s super easy! But also, I wrote it, and I’ve become more familiar over time with the OCaml compilation toolchain, and AST traversals, etc. “Easy” is earned.

Sometimes a “hard api” is the one that only lets you do the right thing, or imposes usage patterns that take a lot of time to figure out, or doesn’t use the metaphors you’re used to. This is, again, entirely subjective.

If this sounds confusing it’s because when people say easy or hard they mean very different things.

Sometimes in the same sentence.

Regarding 2. You absolutely do not need to understand what an applicative is to be able to use the library.

I absolutely agree!

What I meant with “actually understand how to use cmdliner” (emphasis as in the newsletter issue) is that to be able to think clearly about what the code is doing, you need an understanding of applicatives. Maybe only intuitively. That’s what the “actually” stands for in that sentence. Lack of editing gets you this lack of nuance, so I’ll agree with you that this could have been written more clearly.

Of course you can use cmdliner and get a cli running, but that doesn’t mean you understand how it does it – just like I don’t have the palest clue how my induction stove works, and I can still make food.

Here are few reasons for why this is the case:

This is a good background story to understand the current state of things. Thanks for sharing!

These thoughts largely miss the point in my opinion

Perhaps! But the points you focused on when designing and evolving Cmdliner left gaps, by choice or accident, that I tried to highlight. Some of those gaps exist there because you are a proficient ocaml programmer focused on building composable and scalable libraries.

Which is why I really like that you took the time to reply and make your focus clearer as well.

Anyway, I’d be happy to continue this chat and explore some of your new ideas for the API.

Maybe we can find ways to make it composable and scalable while remaining intuitive and convenient to a larger part of the ecosystem.
