OCaml stdlib and death by a thousand papercuts

OCamlverse would love to have more contributions, btw.

These stdlib discussions always remind me of discussions about OCaml’s supposedly bad Unicode support. When you actually go out and see what other languages do, you mostly find either broken support or basically the same representation OCaml has (i.e. strings of bytes). Yet everyone will gladly repeat over and over on the interwebs that OCaml has a problem with Unicode and is thus a no go.
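To make the "strings of bytes" point concrete, here is a minimal sketch using only the stdlib. The names `e_acute`, `byte_length` and `utf8_of_code_point` are made up for illustration; `Buffer.add_utf_8_uchar` and `Uchar.of_int` are the stdlib calls being shown.

```ocaml
(* "é" (U+00E9) written as its two UTF-8 bytes, so the example does not
   depend on the encoding of the source file. *)
let e_acute = "\xC3\xA9"

(* String.length counts bytes, not user-perceived characters. *)
let byte_length = String.length e_acute

(* Hypothetical helper: encode one Unicode scalar value as UTF-8.
   Uchar.of_int raises Invalid_argument on values that are not valid
   scalar values (e.g. surrogates). *)
let utf8_of_code_point cp =
  let b = Buffer.create 4 in
  Buffer.add_utf_8_uchar b (Uchar.of_int cp);
  Buffer.contents b
```

A string is just the encoded bytes; anything character-aware (decoding, normalization, segmentation) is layered on top, which is roughly what the languages with the same representation do as well.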

Now go out and try to find a popular language with the holistic library support some people in this thread assume is needed for OCaml. JavaScript :joy: ?

All this goes back to myths and perpetuating them. As @cemerick cued in his first message, if you want to do newcomers a favour, stop perpetuating the myth that the first thing you need if you want to program in OCaml is to look for an alternate stdlib (and that it should be the one you chose).

This myth doesn’t hold, especially with all the additions that happened since at least 4.08 – my helper functions and modules are shrinking at an alarming rate from version to version. The current stdlib will bring you a long way with excellent stability properties; that’s the fact to communicate to newcomers.
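As a rough illustration of the kind of helper that no longer needs to be hand-written, here is a small sketch that relies only on stdlib functions available since 4.08 (`List.filter_map`, `Option.value`, `Option.to_result`) plus the older `int_of_string_opt`. The function names and the default port are invented for the example.

```ocaml
(* Keep only the strings that parse as integers. *)
let ints_of_strings strs = List.filter_map int_of_string_opt strs

(* Fall back to a default when parsing fails (8080 is an arbitrary choice). *)
let port_or_default s = Option.value (int_of_string_opt s) ~default:8080

(* Turn a missing value into a descriptive error. *)
let parse_port s =
  int_of_string_opt s
  |> Option.to_result ~none:("not a number: " ^ s)
```

For instance, `ints_of_strings ["1"; "x"; "3"]` evaluates to `[1; 3]`, and `parse_port "x"` to `Error "not a number: x"`.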

And for those who actually want OCaml to become popular: rather than inventing committees and new bureaucracy around package management, use the language to build real applications, tell the world it’s written in OCaml, and write postmortems about what worked and what didn’t. That’s the best way to move things forward.

47 Likes

This is so critical! There is pressure everywhere in our current systems to buy into the cancerous “growth over all else” mindset, and seeing others committed to trying to instead discover a sustainable course is really refreshing.

5 Likes

Just to echo @dbuenzli’s sentiment, additions to the standard library are made frequently. Look for “standard library” in the changelog. It’s very common for me to go over some mature code base and find “utility” functions that have been obviated by additions to the standard library. If you’ve implemented some useful routine over and over again in your projects, and you think it would fit in the standard library, I encourage you to propose it!

As far as curating a “blessed” list of packages, I don’t really see the appeal. The best way to endorse a package is to use it. When I am choosing a package, and there are multiple options, unless there is a huge difference in quality or performance, I want to use the one that everyone else is using. I could have sworn that opam.ocaml.org used to show the number of times a package was used, but I no longer see it.

And how would you know to use that specific package over others?

I could have sworn that opam.ocaml.org used to show the number of times a package was used, but I no longer see it.

Yes, I remember that too, seems removed now. I agree that adding a ‘popularity’ metric or a rating system to the package index [EDIT: number of GitHub stars on the repo would be a reasonable first step here] would be a good step in the direction of marketing the ecosystem of packages better. I believe there have also been other suggestions, like adding a ‘tag’ feature so that similar packages can be grouped together and then compared.

1 Like

We pulled the dynamic logs from the opam site due to GDPR compliance reasons, but also because they were largely inaccurate (hit counts were heavily skewed by CI and direct fetching of archives).

However, if you scroll down through https://v3.ocaml.org/packages, you’ll see “most used packages” as a new metric that is calculated solely from the package metadata in opam-repository and gives you a dependency count. This is a much more predictable way of figuring out which packages are central in the ecosystem. (One improvement I’d like to see there is a clickthrough to the top 100 packages, or even a d3.js cluster visualisation of the popular dependency graphs).

Daniel, I think I’ll frame this and put it up on my desk.

12 Likes

I agree that some of the functions you say are lacking would be great to have. Maybe you should write and contribute them! I also have a few functions I believe would be nice to have (inplace_map or filter for the Array module, for instance). I should also write them and try to contribute them.
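For reference, the two array helpers mentioned above could look something like the sketch below. The names `array_map_inplace` and `array_filter` are hypothetical, and this is one possible shape rather than a proposed stdlib signature.

```ocaml
(* Hypothetical helper: overwrite every slot of [a] with [f] applied to
   its current value. [f] has to return the array's element type. *)
let array_map_inplace f a =
  for i = 0 to Array.length a - 1 do
    a.(i) <- f a.(i)
  done

(* Hypothetical helper: arrays have a fixed length, so filtering returns
   a fresh array. Going through a list keeps the sketch short at the
   cost of an intermediate allocation. *)
let array_filter p a =
  a |> Array.to_list |> List.filter p |> Array.of_list
```

Note that the in-place version cannot change the element type, which is the main thing that distinguishes it from `Array.map`.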

Hey dbuenzli. I really respect your work and activity in this community. It’s an awesome force. Nonetheless, this remark struck me a bit. Never have I started a project and thought “what stdlib do I need”. Instead, I start from a near empty state, and iteratively add logic and libs as I go. Eventually, a stdlib alt makes it into my dependencies, and I’ve been around long enough to have evaluated all of the options to have a strong personal preference. Never has such a library made it into my opam lock because I think “i need a replacement”. The library makes it into my dependencies because I think, “i want powerful tools and functions, and don’t want to write them on my own”.

No myths, just desire for productivity.

Every time I write a tiny bit of logic, I follow a mini version of the engineering process.

  • Define the problem.
    • Is this a map reduce problem? Set arithmetic? Bin packing? Search problem?
  • Explore/research
    • What tools are at my disposal? Of course, consult the standard library first.
    • :bomb: Dang, I cannot find a function with enough specificity for my use case. Certainly I could write the function that I seek using stdlib tools. But then my own API surface area goes up, my testing responsibility goes up, etc.

I wager no one is crying wolf because of myth. People are let down because expected powerful tools are missing. Plain and simple.

I was going to enumerate some use cases, but seeing the names in this thread, I don’t think that would be necessary. A simple diff of function names between CCArray and Array or CCList and List I think would indicate that… ya, the resultant set of functions are pretty basic functions that any developer may find useful in general purpose programming.

Edit: here’s CCList - List - CCList's_Operators: CCList minus List · GitHub

I cannot go 10 functions without hitting an occurrence of “yes, used that recently”. I’d subjectively argue that that’s a pretty high hit rate! take, drop, find_mapi, all of the _opt goodies? This stuff is golden.
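To give a sense of scale, here is a minimal sketch of what hand-rolling three of those functions looks like. The semantics are assumed to be the obvious ones (short lists are returned as-is, indices start at 0); this is not copied from Containers and may differ from it in edge cases.

```ocaml
(* First [n] elements of a list, or the whole list if it is shorter.
   Not tail-recursive; fine for a sketch. *)
let rec take n = function
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

(* The list without its first [n] elements. *)
let rec drop n = function
  | _ :: rest when n > 0 -> drop (n - 1) rest
  | l -> l

(* First [Some] result of [f index element], scanning left to right. *)
let find_mapi f l =
  let rec go i = function
    | [] -> None
    | x :: rest ->
      (match f i x with
       | Some _ as r -> r
       | None -> go (i + 1) rest)
  in
  go 0 l
```

Each one is only a few lines, but each also needs its own tests and a decision about edge cases such as negative `n`, which is the API surface and testing cost described earlier in this post.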

11 Likes

Just wanted to second this. The lack of a built-in polymorphic print function was by far my biggest frustration when I started using OCaml. Having a basic ppx available for debug printing would have been a huge help.
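For context, this is roughly what debug printing looks like without a ppx: a hand-written printer per type, built on the stdlib `Format` module. The `shape` type and the function names are invented for the example. A deriver such as ppx_deriving's `show` plugin generates printers of this general kind, though its exact output format will differ from this sketch.

```ocaml
type shape =
  | Circle of float
  | Rect of { w : float; h : float }

(* Hand-written printer for one type. *)
let pp_shape ppf = function
  | Circle r -> Format.fprintf ppf "Circle %g" r
  | Rect { w; h } -> Format.fprintf ppf "Rect { w = %g; h = %g }" w h

(* Printing a list means threading the element printer through. *)
let show_shapes shapes =
  Format.asprintf "[%a]"
    (Format.pp_print_list
       ~pp_sep:(fun ppf () -> Format.fprintf ppf "; ")
       pp_shape)
    shapes
```

Every new type needs its own `pp_` function, and every container needs the element printer passed in explicitly, which is where the friction for newcomers comes from.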

4 Likes

You want a Stdlib (4.12.0) - Containers - Batteries & Base comparison?
Give a look at: ocaml-stdlib-containers-batteries-base-comparisons/ALL.pdf at main · Fourchaux/ocaml-stdlib-containers-batteries-base-comparisons · GitHub and :cry:

3 Likes

Two responses:

  1. The vast majority of programmer time is not spent writing new code: it is spent debugging and maintaining old code. The most expensive resource is not “programmers who can write new lines of code” but “programmers who can figure out what the hell is going on with this mess that’s caught fire and is destroying my business.”

  2. OK, a real story of a “big standard library” and the problems it can cause.

Java 1.1 had a class, StringTokenizer. This class had some behaviour, let’s not worry about what it was. There was a particular behaviour that was deemed a bug. So in Java 1.2 that bug was fixed. Thing is, major enterprise Java products depended on the 1.1 behaviour.[1] So when they upgraded to Java 1.2, they silently broke in interesting ways that required quite a bit of time to debug. And since Java 1.2 was one big piece, it wasn’t possible to use any sort of “revert this one module and see if the problems go away” to debug the problem.

One of the important things that comes with having major libraries outside of your language runtime/compiler/stdlib, is that you can expect (and typically do get) that maintainers will keep their libraries working across a range of language release-versions, so that when there are unexpected issues, you can use bisection on individual dependencies to track down the general area of the problem without using extremely valuable debugger resources.

[1] I purposely didn’t describe the “bug”, b/c to some, the original behaviour was “working as designed/desired”, where for others, the new “fixed” behaviour was that. It all depends on where you sit.

1 Like

dear @cdaringe,

the setup experience seems most relevant to building prototypes rather than to decades of production, doesn’t it? This may be where the different viewpoints stem from.

I once saw web designers shake their heads about Objective-C when it’s so much faster to get something running with Node/JS. They didn’t care about maintenance or reliability or Feb 29th or Y2K or umlauts. (They did about emojis, however.) They wanted to deliver and forget about it.

1 Like

I would submit that multiple things can be true at the same time. I’ll use my personal experience to illustrate.

Upon (re)investigating OCaml a few years ago (IIRC 4.06 was the first release I pulled down), I absolutely read many forum posts, tweets, irc/discord chatter that, in short, Stdlib (Pervasives, then) was old/abandoned/too minimal to be useful/whatever, that choosing a go-to “replacement” was a reasonable prerequisite to being productive, and that it was lamentable that e.g. core or base wasn’t simply adopted wholesale, etc. My perception of this narrative wasn’t helped by my concurrent explorations of Reason at the time, which also had one or two stdlib alternatives that were widely recommended.

In hindsight:

  1. all of those claims were overstated; obviously one can do very good work without using base or core or containers or whatever, as long as you can navigate and utilize the library ecosystem (much like any other language, really). I’d say that the great work of the platform contributors has made crossing that bar much easier in the interim.
  2. it’s absolutely true that Stdlib is, in general, quite minimal compared to its plausible replacements. OCaml is very scheme-like in that sense, for better or worse, and everyone will have their own bugbear about what obviously essential thing Stdlib is missing. I remember being quite surprised that Result wasn’t available; obviously a gap that’s been filled since, but at the time, its absence made it easier to internalize the “stdlib is bad” hyperbole for a while.

6 Likes

Both things can be true:

  1. The stdlib has gotten much better within the last couple of years as the attitude towards its expansion has changed.
  2. There’s still a very large gap to fill to reach parity with the stdlib and out-of-the-box experience of other languages. Sure, you (the generic you) may find every library you need, but the newbie trying out OCaml may give up and go back to another language, i.e. it’s a point of friction. Every little point of friction adds up and causes the language to be adopted less.

That’s why I don’t hesitate to use Containers and to recommend it to newcomers. Could I/they get along with just the stdlib? Certainly. But why bother writing a possibly buggy implementation of a utility when it may already be available in Containers, with possibly better abstractions? Similarly, having a community-blessed set of packages means that we can agree on the basic types and the basic API we use for basic functionality.

4 Likes

My experience with OCaml was actually to do a lot of programming in F# first to avoid this problem, before coming back. F# is very similar to OCaml in many ways but has a better standard library and far better third party libraries. It is also way easier to use (especially on Windows) with Visual Studio. After programming in F# for a long time and becoming familiar with the language, I switched to OCaml, as my F# experience made learning it easier despite the poor ecosystem. For the same reason, I felt experienced enough, even though I was new to the language, to take the approach of writing my own standard library.

While I agree that the stats were probably not the most accurate, they still had value. For example, my junit packages were in the top 10 of most downloaded packages at some point, but they have no rev deps on opam. Without the downloads counter I wouldn’t be aware that it is worth putting effort into maintaining my package.

10 Likes

It’s very easy to create a set of packages to be installed together as a collection via a dependency-only opam package. I have one that I use for many of my own projects: Shon Feder / switch-plate · GitLab. I guess those are the packages that I currently “bless” :slight_smile:

Perhaps another route towards this kind of thing would be to contribute templates to spin or drom (in case you like those tools)?

I could see a future where users are able to select from preconfigured library/utility sets, which are tuned to specific use cases (e.g., scientific computing, web-dev, CLI tools, etc.).

6 Likes