OCaml future development

I see the appeal, but I would tend to only fold in libraries that really are universally used. It would be nice to see lwt/async folded in, maybe a unit test library. My best guess is maybe 40% of projects might use http and json.

When you installed golang, the -packager- provided a package that contained all these libraries you reference. The packager’s task was easy, because the golang source distribution comes with all these libraries.

By contrast, the ocaml distribution does not, agreed. But packagers -could- package those add-on libraries, so that you could easily apt-get install them (or the equivalent on whatever OS platform you use).

The problem here is one of providing binary packages, not so much what is “built-in” to the language distribution.

I don’t really care about Go; I used it just as an example. Even Java has had a built-in HTTP client since version 11. I don’t need to install it or apt-get something to make it work. It is part of the Java stdlib.

There is a process for bringing new modules into stdlib, as evinced by a recent thread: Adding Dynamic Arrays / Vectors to StdLib

Asking for a “batteries-included” stdlib akin to those provided with Go or Python seems like a non sequitur. The OCaml maintainers have quite explicitly chosen a minimalist library strategy, allowing/encouraging users to find the best solutions to problems that aren’t essential to the language’s implementation. (To wit, even what might be considered a “low level” detail like lightweight concurrency is serviced by popular community libraries, lwt and Async.) Asking or hoping for a change in this disposition is pretty unrealistic, and would likely disrupt those that have made bets on OCaml because of its approach in this regard.

To be clear, none of this precludes the development of libraries (or suites of them, constituting a framework for your domain(s) of interest) becoming popular enough to be de rigueur standards (viz. as Rails was/is for Ruby). I think hoping for progress there is more realistic, and quite desirable.

Java is a great example of what NOT to do. The process for getting your module incorporated into the JDK is a very political one, and questions of quality are almost irrelevant. This would be a very bad precedent to follow.

Look: does it bother you that the distribution for GCC doesn’t come with glibc? When you install a binary version of gcc, it prereqs and installs a binary version of a compatible glibc automatically, and you didn’t have to do anything to make that happen. How does that happen? The answer is: the -packagers- of gcc and glibc arrange the dependencies so that this happens. And this is an unalloyed good thing because it means that you can substitute your own glibc if you wish. By contrast, it’s actually pretty difficult to convince the JVM to let you substitute your own version of some built-in class-library.

There’s a conflation of “it comes with the compiler” and “it’s built along with the compiler” and “its sources are distributed with the compiler’s sources” and “it can be automatically installed with the compiler”. These are different things, and there are at least some good reasons to keep these distinct.

Java is a great example of the tradeoffs involved in what kind of stdlib a language should have.

Java has always had an HTTP client. Prior to Java 11 (that is, for…25 years?), it was incredibly unpleasant to work with, had serious security vulnerabilities, and consistently lagged commonplace standards by years. Everyone used alternatives, except for the smallest/messiest of projects. That situation has now improved, but many will continue to use alternatives, since even the new HTTP client is lacking in various features and can be expected to lag security and standards compliance, as it is in a necessarily slower-moving standard library. Meanwhile, the JDK will continue to retain all of the APIs for the original HTTP client, as backwards compatibility demands.

I’m not saying that if there is an HTTP client in OCaml’s batteries we should get rid of all third-party HTTP clients. My point is that an HTTP client out of the box can cover 99% of cases, and I trust a module from the stdlib, because the maintainer of a third-party library can pass it on to someone else who will build in some crypto scam or the like, as happened with an npm module. Or we will end up with something like leftpad.
“OCaml is an industrial strength programming language” is a quote from ocaml.org, but how can I introduce OCaml to my colleagues at work if, to build something simple, I need to download half of the internet? The security people will ask me “Who built all those dependencies?” and so on.

This is an excellent point. The inclusion of add-ons in a standard library is a one-way process and very difficult to undo. It can and does serve as a drag on the core development team, and this was the case in Java.

The problem here is one of packaging and delivery, not one of what is and is not included with the compiler distribution. And the opam team is working on that packaging problem.

No doubt, “building something simple” with OCaml can be more work than in e.g. node if the essential pieces are pieces of plumbing to e.g. HTTP network services. That might be a common domain, but it is a very narrow one that can be readily addressed with a couple of libraries. Hope springs eternal that decisions as weighty as language choice won’t be made on the basis of “do I need 0 or 2 dependencies to write this small demo?”

Third-party dependencies obtained via some kind of package manager (either system-level or opam/npm/pip/mvn, whatever) are a fact of life, regardless of language choice. I assume the security people would apply the same standards and practices they currently apply to Python or Java or JS projects.

For the truly paranoid (or paranoid-employed) that must live behind air gaps, I suspect there are ways to capture all of opam’s repository so that you won’t have to download any of the internet…but I suspect that that’s not the objective here?

There is not much to apply when projects rely on the stdlib, internal dependencies, or libraries maintained by companies like Amazon and so on.

I do think this may have been some of the original objective behind the OCaml Platform, much like the Haskell Platform (though I could be wrong). This is something that’s taken a long time to come together, but currently (as of the last OCaml workshop) the Platform is only focusing on must-have tools needed to participate in the ecosystem, such as dune and opam. Eventually perhaps it could have a set of curated libraries from the ecosystem packaged together with the compiler and standard library.

In general, moving a library from the ecosystem to the standard library has the effect of slowing down its development to match the speed of the compiler’s development cycles. Developing outside of the standard library is far more fruitful.

It should be ‘fairly easy’ to curate a set of popular OCaml libraries in an Opam package for people who want a ‘batteries-included’ library like Go’s. It would just be a meta-package. As such, the ‘only’ work required would be testing that its dependencies work together, and keeping it up-to-date.

It can be just a standard package on Opam. No need to distribute with OCaml.

In any case, I think that most of the main libraries are already backed by industry or a big project (think of janestreet, ocamlpro, mirage, …).

Besides, for industrial use, I would expect to see a curated internal opam repository for the production code and not a direct use of the upstream one.
One can keep track of the updates on opam with a cron test, check the updates and test them in a separate branch, and only then merge them. No matter who maintains them. This should be enough to avoid leftpad situations or malign modifications of the code with rather minor efforts once the system is in place.

If I were a tech lead and someone approached me about using a new language in production, my biggest concern would be how many libraries are solid and ready for what I need, and how many I would have to enhance or create. Is there a good AWS API, is there a good tracing library, etc., for whatever emerging technologies I need to interface with? If I felt that gap wasn’t too large, I wouldn’t be too concerned about other factors.

Personally I agree that it is nice when an ecosystem manages to provide a “batteries included” experience with one centralized go-to place that covers most usual needs. It does come with costs (Python finds it difficult to maintain and evolve its standard library, we will see how Go fares over time), but it has many advantages.

I think that the main reason we don’t have this in OCaml today is that it is very hard to achieve. The amount of effort, coordination, and development needed to provide this is huge, and no one has succeeded in doing it. In the late 2000s and early 2010s I participated in the then-just-starting “Batteries included” effort, a community effort to do exactly this. We ran into a lot of trouble due to the sheer size and complexity of the goal, and eventually we had to slim it down to what is now the “Batteries” library, a standard-library replacement that is nowhere near as ambitious. I’m less familiar with it, but I think that JaneStreet’s Core library is in many ways a similar attempt to “bring everything you need at first”, and despite the considerable resources put into it, I don’t think it covers these needs as well as Python’s or Go’s standard library packages. (It also does very well as a standard-library replacement, and it also later gave birth to a slimmed-down version, Base, to avoid growth and portability issues.)

Some people in the community want to get back to such an effort, and I think that it’s great that they try, but I don’t really believe in it. I would rather focus on admitting that we, indeed, have a fragmented ecosystem of smaller libraries maintained by different people, and try to build from that outside the idea of merging them in a coherent library/framework/collection. We could:

  • set consistent quality standards for hosting, documentation etc.
  • ensure that the important pieces of the ecosystem are maintained in a healthy way (possibly by providing some basic financial support)
  • in addition to the ecosystem-wide build QA that we have thanks to the opam-repository CI, move further towards community-wide runtime tests, benchmarks, etc.

These efforts would converge towards something that is quite different from Python’s or Go’s cathedral of a standard library, more like Perl’s CPAN (or LaTeX’s CTAN) repository of packages. I don’t have a strong opinion on which of the two models is better, but I think that the latter one is a more realistic hope for the OCaml ecosystem, and I sort of like that it works by weaving together many small pieces from different people, without requiring centralized ownership.

Somehow this can be inferred between the lines of what has been said in this thread but I think it is worth making more explicit:

The batteries included of today are unlikely to be those tomorrow – for reasonable or pointless reasons – but your programming language, being Turing complete and rooted in logic, will likely stand the test of time.

With respect to this observation I prefer a “CPAN” model.

However this doesn’t answer the question of what a language standard library should really provide.

My take on this is that it should provide the minimal infrastructure to allow the batteries included of today and tomorrow to be defined in a lean, interoperable and distributed way outside of the language distribution – assuming a minimal amount of cooperation among the external actors.

This means providing datatypes and essential functions so that you don’t end with two different concurrency monads and three different modules of string convenience functions when you try to build an application with the bazaar.
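As a minimal sketch of that point: the stdlib's `result` type already plays this role for error reporting. The two modules below, `Config_lib` and `Net_lib`, are hypothetical stand-ins for independently written libraries; neither knows about the other, yet they compose without glue code because both use the shared stdlib type.

```ocaml
(* Hypothetical library A: parses a port number. It reports failure
   through the stdlib [result] type rather than a private error type. *)
module Config_lib = struct
  let parse_port (s : string) : (int, string) result =
    match int_of_string_opt s with
    | Some p when p > 0 && p < 65536 -> Ok p
    | Some _ -> Error "port out of range"
    | None -> Error "port is not a number"
end

(* Hypothetical library B: written separately, same shared type. *)
module Net_lib = struct
  let describe (port : int) : (string, string) result =
    if port < 1024 then Error "privileged port"
    else Ok (Printf.sprintf "listening on %d" port)
end

(* Application code chains both with the stdlib's own [Result.bind];
   no adapter between two competing error types is needed. *)
let start (s : string) : (string, string) result =
  Result.bind (Config_lib.parse_port s) Net_lib.describe
```

Had each library defined its own error type, `start` would need a conversion layer, which is the small-scale version of ending up with two concurrency monads.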

I’m not saying the line is easy to draw, but I think this broad definition can still guide some decisions (e.g. rather than the language/stdlib providing support for and enshrining a specific serialization format, it’s likely a better idea to provide infrastructure for the serialization problem).
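To make the serialization example concrete, here is one possible shape such infrastructure could take. This is only a sketch under stated assumptions: the `value` type and every function name below are hypothetical and do not exist in the OCaml stdlib. The idea is that a shared, format-neutral data model lives in one place, while concrete formats are supplied by separate packages.

```ocaml
(* Hypothetical format-neutral data model; not part of the stdlib. *)
type value =
  | Null
  | Bool of bool
  | Int of int
  | String of string
  | List of value list
  | Record of (string * value) list

(* A library author describes a type once, against the shared model. *)
type user = { name : string; age : int; tags : string list }

let user_to_value (u : user) : value =
  Record
    [ ("name", String u.name);
      ("age", Int u.age);
      ("tags", List (List.map (fun t -> String t) u.tags)) ]

(* Independent packages could then supply concrete formats.
   Note: %S uses OCaml string escaping, which matches JSON only for
   simple ASCII strings; a real encoder would escape properly. *)
let rec to_json : value -> string = function
  | Null -> "null"
  | Bool b -> string_of_bool b
  | Int i -> string_of_int i
  | String s -> Printf.sprintf "%S" s
  | List vs -> "[" ^ String.concat "," (List.map to_json vs) ^ "]"
  | Record fields ->
    let field (k, v) = Printf.sprintf "%S:%s" k (to_json v) in
    "{" ^ String.concat "," (List.map field fields) ^ "}"

let rec to_sexp : value -> string = function
  | Null -> "()"
  | Bool b -> string_of_bool b
  | Int i -> string_of_int i
  | String s -> Printf.sprintf "%S" s
  | List vs -> "(" ^ String.concat " " (List.map to_sexp vs) ^ ")"
  | Record fields ->
    let field (k, v) = Printf.sprintf "(%s %s)" k (to_sexp v) in
    "(" ^ String.concat " " (List.map field fields) ^ ")"
```

With this split, `user_to_value` never mentions JSON or S-expressions, so a format falling out of favour costs one encoder rather than every type that was ever serialized.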

That’s an excellent example! B/c as it turns out, Java did precisely pick a serialization format, and then that format turned out to not be particularly well-chosen, and the implementations in various JDKs were laughably inefficient, etc, etc. Maybe it’s all been cleared-up, but as recently as 15yr ago, it was a bloody mess. Partially because the gatekeepers of “the language” chose sides in this.

It could be the case that the OCaml community is not one that has been moving towards a standard way to solve problems, as the Python community has. You have the compiler folks kind of doing their own thing. And you have a bunch of developers that are here despite the lack of all this standardization, so it’s clearly not super valuable to them. For myself, while the lack of libraries can be frustrating at times, I’m generally using OCaml because the language lets me create all of that fairly quickly and correctly relative to other things, so I end up re-implementing a lot of things just to do it “my way”. For me, this is a welcome change from a language like Python, where I often find myself spending hours trying to get an existing library to do the thing I want.

While Jane St has done a lot of good for OCaml, let’s not forget that they did split the entire OCaml universe when they made Async. They could have evolved Lwt to suit their needs and support a common OCaml concurrent runtime, but they chose not to. I think this is kind of the programmer OCaml attracts currently. I, myself, have my own concurrency monad that I wrote and use quite a bit because I didn’t like how Lwt or Async handle errors. I feel no guilt.

So before one tries to start standardizing things, one should probably ask whether OCaml is a community that values such things and wants such things. Currently I believe the answer is no. Maybe opam and dune are the first steps in that direction, but it’s a long way to go to pull the rest in.

Thanks for your informative reply.
My sensation about the shaky/fragmented ground concerns mainly the development process rather than stability and backward compatibility, mostly because I probably don’t grasp it fully. That is: it seems that there are groups of people working on features independently without following any plan, and then those features get merged in the next releases; but there’s no apparent, clearly stated direction in which to steer the development/evolution of the language (for instance: before starting this thread, I was not able to find all the info you gave). It feels as if everything is emerging spontaneously somewhere and being plugged in as needed. I know that it is just a sensation and it’s probably very far from reality, but still, it’s there. I don’t mean to belittle the work being done, for which, on the contrary, I am very thankful.

I should note that a purely centralized approach is often not very suitable in the long term. Evolution doesn’t plan - it discovers.
