The end of Camlp4

Dear community,

As you may know, the Camlp4 project has been relatively inactive for the past few years. With the help of @gasche, @dra27 and a few other contributors, we have been keeping the project alive so that existing OCaml projects using Camlp4 could continue to build with new OCaml compilers.

At this point, we expect that the most used features of Camlp4 are covered by ppx, and in particular all the effort is focused towards the latter.

We feel like the time has come to officially abandon Camlp4. In order to help distributions and other package managers deal with this fact, we will soon release a 4.08-compatible version of Camlp4. This will in particular help get OCaml 4.08 into Debian. However, this will be the very last release of Camlp4. After that, the project will be abandoned and will no longer receive updates.

Of course, anyone interested in taking over the project is very welcome to do so. Please get in touch if you are interested and we will happily arrange for the transfer of ownership.

Migrating away from Camlp4

There are a lot of code bases out there still using Camlp4. If you need to bring an old project using Camlp4 to the modern age, I recommend reading this blog post, which describes in detail how the huge Jane Street code base was migrated from Camlp4 to ppx. In particular, the camlp4-to-ppx tool mentioned in the blog post is available on GitHub.

Stream parsers

One of the pain points of migrating away from Camlp4 is the stream parser syntax. In the past, the OCaml compiler used to have a syntax for stream parsers. When Camlp4 was merged into OCaml, the stream parser syntax moved from the OCaml parser to Camlp4.

Nowadays, Camlp4 is the only tool that understands the stream parser syntax. Stream parsers are not considered to be a great design and we encourage users to explore alternatives such as lazy lists or parser combinator libraries.

However, should you need to upgrade a large code base using Camlp4 and stream parsers, it would be quite easy to develop a ppx syntax for stream parsers and automatically upgrade the syntax using camlp4-to-ppx. If you would like to explore this possibility and need pointers, do not hesitate to get in touch!
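To make the lazy-list alternative mentioned above a bit more concrete, here is a minimal sketch of a recursive-descent parser over the standard `Seq` module, with no syntax extension involved. The `token` type and the tiny grammar are made up for illustration; they are not taken from any existing project.

```ocaml
(* Hypothetical tokens of a tiny arithmetic language. *)
type token = Int of int | Plus | Times

(* Grammar:  expr ::= term ('+' term)*    term ::= Int ('*' Int)*
   Each function takes the remaining tokens as a lazy sequence and returns
   the parsed value together with the tokens it did not consume. *)
let parse_atom (ts : token Seq.t) : int * token Seq.t =
  match ts () with
  | Seq.Cons (Int n, rest) -> (n, rest)
  | _ -> failwith "expected an integer"

(* Fold a run of "* atom" into the accumulator. *)
let rec term_tail acc ts =
  match ts () with
  | Seq.Cons (Times, rest) ->
      let rhs, rest = parse_atom rest in
      term_tail (acc * rhs) rest
  | _ -> (acc, ts)

let parse_term ts =
  let lhs, rest = parse_atom ts in
  term_tail lhs rest

(* Fold a run of "+ term" into the accumulator. *)
let rec expr_tail acc ts =
  match ts () with
  | Seq.Cons (Plus, rest) ->
      let rhs, rest = parse_term rest in
      expr_tail (acc + rhs) rest
  | _ -> (acc, ts)

let parse_expr ts =
  let lhs, rest = parse_term ts in
  expr_tail lhs rest

(* Parse a whole input and reject trailing tokens. *)
let parse (ts : token Seq.t) : int =
  let v, rest = parse_expr ts in
  match rest () with
  | Seq.Nil -> v
  | Seq.Cons _ -> failwith "trailing tokens"
```

For example, `parse (List.to_seq [Int 1; Plus; Int 2; Times; Int 3])` evaluates to `7`. Note that this sketch may force the same sequence node more than once, which is fine for sequences built from a list but would require memoizing a sequence that reads from a channel.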

Many thanks,

Jeremie

I exported the list of published packages that still rely on camlp4 (I removed all pa_* packages as obsolete, as well as some packages that have already dropped camlp4 but have not yet released, such as coq or coccinelle):

So it makes sense to migrate the packages either to ppx, or to menhir/ocamllex/angstrom/whatever. Or, in the worst case - to camlp5. And if the project is dead - set the upper limit of the OCaml version to 4.08 in the opam file.

As a first approximation, it seems fine to put an upper bound on OCaml for all these projects and wait to see if anyone really needs them. I can see a few projects that have been replaced by ppx equivalents, such as custom_printf or type_conv. Some projects such as lambda-term have a new version that is not using camlp4, and some are just obsolete, such as text or estring (these are old projects of mine).

some are just obsolete, such as text or estring

Interesting to know. So we can remove them from Debian. Any guidance like that is very much welcome.

Absolutely!

From the list above, here is a list of projects you should feel free to remove from Debian:

  • enumerate
  • estring
  • faillib
  • format
  • herelib
  • optcomp
  • pipebang
  • text
  • type_conv

These are all old Jane Street projects or old projects of mine. I’m pretty sure quite a lot of other ones can be safely removed as well.

here is a list of projects you should feel free to remove from Debian

Thank you. Actually removing them will take some time, since they have reverse dependencies (especially type-conv). I’ve updated our wiki with this information. I am aware some of the reverse dependencies have newer upstream versions that remove the dependency. I guess some of those will have to be removed as well if nobody cares about them in Debian.

AFAICT, faillib and format are not in Debian.

There is one more dependency, which is pretty popular and even serves as an entry point to the language for many students. It’s the LLVM OCaml Kaleidoscope set of tutorials, which heavily relies on the stream syntax. I also believe that there are some courses in the wild that use the stream syntax.

Also, what about camlp5? Is it also no longer supported?

AFAIK, camlp5 is still maintained by its original author.

Regarding the stream syntax, we discussed it on the OCaml developer mailing list and nobody thinks streams and stream parsers are a great design. Given that, it seems better not to use them in tutorials and courses. Do you know who maintains the LLVM OCaml Kaleidoscope? Maybe we could reach out to them to let them know about this.

Lately I prefer writing parsers with Angstrom. It offers a lot for writing easy-to-understand yet performant parsers: https://github.com/inhabitedtype/angstrom
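To give an idea of the style, here is a small sketch of an arithmetic evaluator, adapted from the example in the Angstrom README. It assumes the `angstrom` package is installed and recent enough for `parse_string` to accept the `~consume` argument; the helper names `ws`, `token` and `eval` are my own.

```ocaml
open Angstrom

(* Skip optional spaces; [token p] runs [p] and then drops trailing spaces. *)
let ws = skip_while (fun c -> c = ' ')
let token p = p <* ws

let parens p = token (char '(') *> p <* token (char ')')

let integer =
  token (take_while1 (function '0' .. '9' -> true | _ -> false) >>| int_of_string)

(* Operators parse to the function they denote. *)
let add = token (char '+') *> return ( + )
let mul = token (char '*') *> return ( * )

(* Left-associative chain "e (op e)*", as in the Angstrom README. *)
let chainl1 e op =
  let rec go acc = (lift2 (fun f x -> f acc x) op e >>= go) <|> return acc in
  e >>= go

(* Precedence falls out of the layering: factors, then products, then sums. *)
let expr : int t =
  fix (fun expr ->
      let factor = parens expr <|> integer in
      let term = chainl1 factor mul in
      chainl1 term add)

(* Parse and evaluate a whole string; any leftover input is an error. *)
let eval (s : string) : (int, string) result =
  parse_string ~consume:All (ws *> expr) s
```

With this, `eval "1 + 2 * (3 + 4)"` should return `Ok 15`, and malformed input comes back as an `Error` with a message instead of an exception.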

So, there is at least one tool that still supports the stream syntax.

Yeah. Probably camlpX wasn’t that great an idea, and we’re currently paying the technical debt for it.

It is maintained by the LLVM team, don’t know by whom in particular. And yes, an ideal solution for OCaml Kaleidoscope would be to rewrite it using modern OCaml and parser combinators, but it is quite an effort. Not a project for a couple of hours.

We should also understand that a dependency on camlp4 in opam is not the best indicator of demand for the package. First, there are lots of private/commercial projects that are not published to opam. Second, there are a few users who prefer to use the OS distribution systems or other package managers (such as nix or guix), so a better indicator would be to gather statistics from those resources, as well as to account for the number of installations of camlp4 in opam. For example, camlp4 was installed 2500+ times from opam for 4.07.1 alone, and there are about 3000 installations of camlp4 in Debian each month. From this, we can conclude that camlp4 is still a major (if not critical) component of the OCaml infrastructure and dropping it might have a corresponding impact (it might not; it’s hard to predict, of course).

I understand, of course, that you have your reasons and that the support burden of camlp4 is very high, but I would just like to highlight that camlp4 is not some old package that nobody is using anymore. There are actually quite a few users. (And many of them are not visiting discuss or subscribed to the OCaml mailing list, so they will notice the lack of the package only several months, if not years, later, when it hits the major distributions.)

There are actually quite a few users.

Then maybe one of them can take on maintenance of camlp4? In Debian, I chose to not do that (for the time being) and instead focus on removing uses of camlp4. Depending on the progress on camlp4 removal when I want to update OCaml to >= 4.09 in Debian, I might change my mind.

And many of them are not visiting discuss or subscribed to the OCaml mailing list

If they use opam, they will notice that camlp4 will stop being installable and if they dig, they will surely discover this page. If they don’t use opam, then probably they don’t need a recent OCaml version anyway.

It doesn’t work that way :) Eating a cake and cooking the cake are very different kinds of activities, requiring quite different investments and skills. Many of the users of camlp4 are actually students and researchers. Again, OCaml Kaleidoscope is a good example, as it serves as an entry gate to the language. Imagine a young researcher who uses this tutorial to learn how easy it is to develop a new language or analysis with OCaml in LLVM and then decides to learn OCaml after that. Now, after camlp4 is dropped, he or she won’t be able to use the tutorial (and remember, the young researcher is not even aware of the mere existence of camlp4; he or she is just following the default instructions for OCaml installation), so we are losing a potential OCamler.

But they don’t really have a choice. You have to run to stay on par with OCaml. It is not like there is an option to install old versions of the compiler/opam and that the old versions are supported. Most of the code that uses features that are not supported by the latest version of OCaml is doomed to die in a couple of years.

I think we can update the tutorial not to use camlp4. That would take only a few hours. It’s not that complicated.

Yep, for OCaml Kaleidoscope it would be an optimal win-win solution. In fact, this tutorial needs some love.

I’m going to look at doing this (unless someone is familiar with a pre-existing modernization of the tutorial.) What are the other things you think should be updated besides just the camlp4 based parser?

Would it make sense to turn the tutorial into a dune+ocamllex+menhir project? Perhaps with a matching github repo?
Afaict writing a lexer+parser with ocamllex+menhir is a bit foreign at first, but also much more convenient than stream parsers. And a real compiler will be better off with menhir anyway.

That is what I would want to do, yes, along with improvements to the tutorial’s (pretty bad) human explanations of what the code is doing. Violent agreement.

It might be in line with the ongoing LLVM project migration to Git[Hub].

Yeah, honestly I think it is better just to rewrite both the text and the code of the part of the tutorial that concerns parsing, rather than attempting to reimplement their recursive descent parser with its manual precedence resolution. This will make the tutorial clearer and more focused on actually implementing the backend (which is the main purpose of the tutorial).

What are the other things you think should be updated besides just the camlp4 based parser?

The use of the Stream module (which is pretty unusable without the stream syntax). Just style issues, you will see them definitely.

Also, it might be a good idea to switch to dune.
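As a rough idea of what replacing the Stream-based lexer could look like, here is a sketch of a hand-written lexer that turns a string into a lazy `Seq` of tokens using only the standard library. The `token` type is loosely modeled on the one I remember from the tutorial (without the keyword cases), so treat the constructor names and the lexing rules as assumptions.

```ocaml
(* Assumed token type, simplified for illustration. *)
type token = Number of float | Ident of string | Kwd of char

let is_digit c = '0' <= c && c <= '9'
let is_alpha c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')

(* Index just past the longest run of characters satisfying [p] from [i]. *)
let rec span p s i =
  if i < String.length s && p s.[i] then span p s (i + 1) else i

(* Lazily tokenize [s]; nothing is scanned until the sequence is forced. *)
let tokens (s : string) : token Seq.t =
  let rec next i () =
    if i >= String.length s then Seq.Nil
    else
      match s.[i] with
      | ' ' | '\t' | '\n' | '\r' -> next (i + 1) ()
      | c when is_digit c ->
          (* Naive number rule: digits and dots. Malformed input such as
             "1.2.3" makes float_of_string raise Failure. *)
          let j = span (fun c -> is_digit c || c = '.') s i in
          Seq.Cons (Number (float_of_string (String.sub s i (j - i))), next j)
      | c when is_alpha c ->
          let j = span (fun c -> is_alpha c || is_digit c) s i in
          Seq.Cons (Ident (String.sub s i (j - i)), next j)
      | c -> Seq.Cons (Kwd c, next (i + 1))
  in
  next 0
```

For instance, `List.of_seq (tokens "foo(4.5)")` gives `[Ident "foo"; Kwd '('; Number 4.5; Kwd ')']`. In a real rewrite ocamllex would probably be the better tool, as suggested earlier in the thread; this only shows that the Stream module is not needed for the lexer.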

Yah, my thought would be to fully rewrite the parsing, maybe even most of the tutorial, modernize the build (to dune), and dramatically improve the text.
