The end of Camlp4

As a language it’s terrible. But I don’t think the stability has been the problem there. I also think it’s vital that one be able to view historical academic papers. You literally can no longer do anything with a Word document from 1990; if you have an old paper in Word, the only way to preserve it is to reformat it from scratch from a paper copy if you have one. TeX, by contrast, is “safe”; if you write something in it, you know that in 40 years it will probably still be possible for people to generate a formatted version.

The language itself reflects Knuth’s 1960s era sensibilities, and is a very bad design. But had it been unstable, it would have been a bad design and strictly less useful.

Perhaps it is indeed always possible to write future-proof code, but I think people generally expect that if a tool works today there should be little reason it will break tomorrow, and only experts know what’s brittle and what isn’t. One can of course just blame the users, but that smacks a bit of people saying “well, if you’re a really good programmer, you can write decent C code that doesn’t have undefined behavior in it, it’s your fault if you make a mistake.”

I think we should presume that most people want tools that don’t require a high degree of human perfection to use. We like strongly typed languages because they help the user even when they are imperfect. I think a language ecosystem should, in general, be supportive in that way.

I agree with @dbuenzli 100% here. I think a key point to remember is that while modern OCaml users love 90% of the language and its existing ecosystem, there is/was that 10% of the language/ecosystem design that’s lacking. Case in point: mutable strings; the build system nightmare pre-dune and opam; the (still) severely lacking stdlib. OCaml users of old knew about these issues, but just worked around them. (Camlp4 is a little different – I think users loved it, but the core devs explained it wasn’t a good idea and the users understood why). Keeping these problems around indefinitely for legacy compatibility is a terrible idea – languages are extremely competitive, and if you can’t provide value compared to other languages, you will die off.

In fact, I’d say we still know that there are weaknesses: the stdlib is still lacking; the grammar itself is problematic around if/then/else (compared to other languages); and we all see the deficiencies of functors as compared to type classes – there are now 2 popular type class based languages, and OCaml ultimately needs something similar to compete. I wouldn’t expect the core devs to ignore these needs because of legacy reasons.

One thing that is useful is to build versioning into as much as possible in anticipation of change. For example, if the stdlib were versioned, we could have a command line argument (provided by dune) to switch to the next version of the stdlib when it arrived, with the option to keep the old stdlib version. If the grammar were versioned, we could have a new grammar that fixed the small (but annoying) issues that pop up, and the old grammar would still be available using a command line argument, etc.
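Dune already does something like this for its own configuration language: the `(lang dune X.Y)` stanza in `dune-project` pins the version of the dune language, so old build files keep their meaning across dune releases. A stdlib switch could look analogous; the `stdlib_version` stanza below is purely hypothetical, shown only to illustrate the idea:

```lisp
;; dune-project
(lang dune 3.0)           ;; real: pins the dune configuration language version
;; (stdlib_version 5.1)   ;; hypothetical: an analogous opt-in stdlib version
```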

Versioning and mechanisms like it are a means of maintaining strict backwards compatibility. Versioning, code-upgrade tools, and similar machinery allow you to move fast and throw away bad ideas from the past while still letting old code work without breaking.

Personally I try to avoid ppx for this very reason. It tends to break at every release for a little while, and fundamentally depends on a private, unstable part of the compiler’s API (the parsetree).

My main other concern for the stability of my own projects is dune. I really hope my dune files will still be valid years from now. That’s maybe the academic in me: sometimes you write a project for research and leave it untouched for years, and it’s annoying if coming back to it later requires a ton of effort.

Please, don’t interpret it that way. My point is not that we shouldn’t change or deprecate. My points are:

  1. don’t break unless it is strictly required;
  2. do not impose policies; let the users decide what is good and what is bad.

There are lots of good changes, and OCaml is a much better language than it was 5 years ago, for example. Moreover, for a long time OCaml as a language indeed wasn’t breaking anything; that started only recently.

Well, that is true in your reality, a reality where camlp4/ppx are not used. Let’s put it plainly: your style of OCaml coding is quite different from what is proposed, for example, in the Real World OCaml book.

Speaking of which, this is the latest and arguably the main book about OCaml… which is now a total waste of paper, because it is outdated. This brings me to another story: a fellow professor of mine was planning to teach a course using OCaml and base it on RWO. A couple of years later, they abandoned the idea, because the book no longer works. They also abandoned OCaml and now teach the course in Go.

I’m not talking about the management, I’m talking about sponsors and business realities. Yes, maintenance is needed, we all know this, and we include its cost in the product life-cycle. The problem is that when the maintenance cost of one language is much higher than that of another, we have to change languages. Sometimes rewriting a project from scratch in another language can be cheaper than maintaining it in an infrastructure that breaks every other year.

I don’t want this discussion to get bogged down in the details of each particular solution to each particular problem. For what it’s worth, most of those changes could have been made without breaking any code. Immutable strings could have just been added as another type. We don’t need to break the standard library to extend it, and we now have opam, so it is not really necessary to put everything into the trunk.

That’s an interesting point. It reminds me of Haskell adding the Text type, though I don’t think that went so well either. We could have kept string mutable and added an immutable `text`.
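For context, the route OCaml eventually took was the reverse: with safe-string, `string` became immutable and `Bytes.t` is the mutable variant. A minimal sketch of the split:

```ocaml
(* Under safe-string, [string] is immutable; mutation goes through [Bytes.t]. *)
let () =
  let b = Bytes.of_string "hello" in    (* copy into a mutable buffer *)
  Bytes.set b 0 'H';                    (* in-place mutation is fine on bytes *)
  let s : string = Bytes.to_string b in (* freeze back into an immutable string *)
  print_endline s                       (* prints "Hello" *)
```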

That’s mostly due to the fact that it hews very closely to Core, which is a library that does pretty much whatever it wants. That was never a good idea IMO. In fact, any book is going to age badly given how slim our stdlib is.

EDIT: Thinking about it some more, I don’t know of books for programming languages that don’t age badly, unless they’re very basic or the language doesn’t advance at all.

Unfortunately, for me at least, there are things in the ppx ecosystem (like ppx_deriving_show) that are just too attractive to avoid: in that particular case, because it lets me printf-debug very easily by dumping out values. (Would that we had modular implicits and a show for every type, but we don’t.) There are also some that one could avoid (say, ppx_regexp) but that are still too convenient.
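For illustration, here is roughly what that buys you; the `point` type is made up, and with the ppx the printer below would come from a single `[@@deriving show]` annotation instead of being written by hand:

```ocaml
type point = { x : int; y : int }

(* Hand-written equivalent of the printer [@@deriving show] would generate. *)
let show_point p = Printf.sprintf "{ x = %d; y = %d }" p.x p.y

(* The payoff: one-line printf debugging of structured values. *)
let () = print_endline (show_point { x = 1; y = 2 })  (* { x = 1; y = 2 } *)
```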

Perhaps one should ask the question of why the parse tree needs to change so much with every release?

That’s a very alarming story. OCaml is not super popular, and instability could hurt it even more. It would be great if the second edition of RWO avoided ppx altogether to prevent such problems.

To be honest it’s not that complicated: program in ML, not in MML.

Besides, and this is true for any tech: don’t assume that because one successful company with a lot of resources uses and publicizes certain tools, those tools are good or are what your own projects need.

I can’t be responsible for what this book proposes. If it teaches brittle practices, then maybe people should stop recommending it to newcomers (I usually recommend John’s series of books instead). Also, it’s not just my reality; anyone can follow suit. Sniffing out and making technological choices is part of the trade.

ppx_deriving_* provides what should have been a part of the language itself, in my opinion. Haskell and Rust both have the feature out of the box after all. And it saves a lot of time. I can’t imagine any big project without such a feature.

Again, if you peruse the first edition, you’ll see that the OCaml language basics have remained mostly the same. It’s the standard-library parts that have completely changed, because they wrote the book for Jane Street’s library rather than for the stdlib or some extension thereof which the authors could control.

Maybe people should rather ask the question how can I solve this problem in the language itself.

I really wish camlp4 or ppx had never existed. Much more interesting papers would have been written, and libraries developed, on how to solve these problems in the language itself, rather than people immediately resorting to the metaprogramming hammer.

As a small reminder, by that time camlp4 was part of the OCaml compiler, and was a rather new (and not even documented) feature, introduced in place of camlp5 (which was called camlp4 before that). Therefore it is the language that changed a lot: by first switching to ppx, then removing camlp4, then deprecating it altogether. But, again, that’s not to say this was a bad move.

Well, first of all, not everyone has your experience or perspective. A young developer, or just a newcomer to OCaml, is not able to distinguish between what is right and mature and what is the hype of the day that will soon die. It is especially surprising when the brittle part belongs to the language itself, which everyone assumes is written in stone.

Well, in the projects that I build, we usually have thousands, yes, literally thousands, of data types. So even if it were only one or two lines of code per data type (in fact it is much more), it would add up to thousands of lines of code. Code that is very low-level and not really covered by the type system, so you also have to write thousands of lines of testing code and spend thousands of hours actually fixing the bugs. All of this ends up as money and time lost, missed deadlines, and other problems. That is the reality in which I live. I just can’t afford not to use it, despite the fact that I will have to pay this technical debt later.

I tend to agree. This is one of the reasons I was surprised when I first joined the community that things like modular implicits (which would make some such features easier) weren’t a high priority. But at the moment, such features are not a high priority, and one needs to use parts of the ppx ecosystem to get them. And so I use the ppx ecosystem.

Perhaps true. That said, as such papers haven’t been written (and such features aren’t in the language itself) one finds oneself wanting to use ppxes.

This is an argument for having a standard derive mechanism that ships with the compiler (as in Haskell or Rust), imho. Something less powerful than full ppx, but that just works™ and can be used for printing/equality/(de)serialization through a basic stable API.

I have wondered since almost the start of my involvement with OCaml why such a thing doesn’t already exist. They’re things one very clearly wants.

It may be enough to just have a copy of ppx_deriving/lib/whatever be inside the compiler repo, with a few select ppxs, and test cases to make sure these are always up to date with the latest compiler changes.

There’s also another perspective on this, which is that the compiler versions are released too soon. I’ve mentioned this idea before, but we could have more stages to the compiler release, where the core team is no longer involved, but essential parts of the ecosystem must be updated before anyone else can even touch the new version.

Some are written.

The question is how much it evolves over time. You can also elect to run that as a one-off source-level generation step, to produce the first big version, and then let the generated code evolve without depending on the whole ppx infrastructure.

Same for me, perhaps it’s because it’s a problem that has no really satisfying solution. The impression I get is that in any language you can have either static type checking or you can have universal printing/equality/(de)serialization, but you can hardly have both without paying a price in terms of performance or complexity or power of the language.
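OCaml’s built-in polymorphic equality is a small example of that price: it is “universal” in that it works on any type, but it escapes the type system and can fail at runtime, for instance on functional values:

```ocaml
(* Polymorphic (=) compares any two values of the same type structurally... *)
let () = assert ((1, "a") = (1, "a"))

(* ...but the type checker cannot rule out the cases it doesn't support:
   comparing closures raises Invalid_argument at runtime. *)
let () =
  match (fun x -> x + 1) = (fun x -> x + 1) with
  | _ -> print_endline "compared"
  | exception Invalid_argument _ -> print_endline "cannot compare functions"
```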