Why is building OCaml projects still so hard?

This is great, and I’m aware of and thankful for all this work. And yes, things have improved a lot.

I think what’s a little frustrating on my end is that whenever things are getting better, a new technique is introduced that breaks the thing I like to build and use. (Keywords: OCaml build systems/ppxlib/camlp4/camlp5/bucklescript)

Sorry @alan, but I don’t fully understand what you’re trying to say here, and I’m not sure I fully agree with your description.

I don’t think separating the tools for building (dune) and packaging (opam) is the problem, but rather the fact that they use completely different and therefore separate configuration languages. The level of redundancy is just baffling.

As cool as this tool is, its existence and particularly its own history somehow prove my point. There is too much movement, and compatibility is broken too easily. The OCaml platform as a whole seems to be in constant research mode.

I was in the community back in the day when there was no forum, just a mailing list. That was a time when the biggest issue was somebody posting in the wrong list (beginner vs. developer, lol). There was never anything wrong with OCaml itself.
So users were discouraged and user numbers declined. I sense the same spirit here. That should be your concern, I think.

I’ll be ignoring this after this post, as I do not believe you and I will have a productive conversation. I am stating two points, in case silence is interpreted as concession.

  1. I see nothing wrong with my post history. It shows (1) I ran into a large number of brick walls, (2) I asked for help, and (3) I worked through them.

  2. I’m really grateful for projects like GitHub - jchavarri/jsoo-react-realworld-example-app: Exemplary real world application built with Js_of_ocaml and React. Starting with a blank slate is very, very difficult; starting with a 99.9% working build is easy. In particular, it overcomes the psychological hurdle of “is this even possible at all”. Instead of comparing the repo against a hypothetical perfect repo, it may be helpful to compare the repo against nothing.

2 Likes

I have to confess, I think you’re gravely mistaken. Yes, dune is … complicated and opaque. But opam? That stuff is great. And contrary to what you’re implying, it’s not all sweetness and light everywhere else. An example: in Rust-land, a package that is <1.0 (which almost all packages are) doesn’t get proper dependency-checking during release via crates.io. And cargo doesn’t actually get the versions right, either.

Here’s my post about it, back when I was learning Rust: Cargo pulling in multiple versions (but inadvertently) - The Rust Programming Language Forum

Basically, cargo silently and automatically vendors a package, causing a build failure, when if it simply chose not to vendor the package, everything would work out fine. And there’s no way to tell it not to vendor. It was a mess.

Yes, there are packages that aren’t available through opam. Yes, sometimes it takes time for packages to get updated in opam. But what you get in exchange is that, pretty much, if opam tells you a package will install, then it will install. That’s enormous, and it succeeds even though we have a massive community of developers who work completely independently.

This is a massive achievement, and it’s due to the work of the opam maintainers and the quite thorough CI testing they do. It’s pretty damn thorough – far more thorough than I would do myself. And it’s worth it.

By the way, I never use dune. Hate it. Just hate it. I write Makefiles for everything I build, and I’ve built quite a number of packages: camlp5, some support libraries, and 10 different PPX packages (based on Camlp5), one of which is rather massive, covering the function of ppxlib, ppx_import, the standard derivers, and a bunch of other PPX rewriters. And they all work, and OPAM-CI tests them just fine.

P.S. I should note that the response from the Rust community was “working as designed”. One thing the opam maintainers would not suffer is a “design” of the package system that causes a collection of already-deployed packages to fail to build when used together, even though each of them builds just fine separately. I think they’d see that as a bug.

3 Likes

Here is what I understand about the stability ppxlib offers:

  • Before handing the AST to the rewriter, ppxlib uses its own Parsetree migration to migrate it to the latest version it knows. So for ppxlib.0.29 that’s the 5.0 parsetree, but for earlier versions of ppxlib it might be an earlier parsetree. So if the rewriter gives upper and lower bounds on ppxlib, it knows it will get a certain version of the parsetree, whatever version of OCaml is in the current switch.
  • Of course, it is not satisfactory to add constraints on ppxlib: different PPXs will have different constraints, which ends up a mess. So ppxlib offers two modules that are “parsetree version independent”, one for building nodes and one for pattern-matching nodes: Ast_builder and Ast_pattern (plus one for traversing nodes).
    Those work on every version of the parsetree. Their API has to be stable, otherwise the problem stays the same. So when new features are introduced, a solution has to be found. It can be by adding optional parameters, but that’s not always possible or satisfactory. Recently a Latest module, which is unstable, has been introduced, allowing use of the latest features. But I think a better option would be to have Since_<version> modules, which would allow both stability with respect to upcoming releases and the possibility to use new features. (A short sketch of code written against these modules follows right after this list.)
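
Here is that sketch: a tiny ppxlib rewriter that goes only through Ast_pattern and Ast_builder. The extension name and behaviour are invented for illustration, and it assumes the file is built as a ppxlib-based ppx_rewriter library.

open Ppxlib

(* [%shout "hi"] expands to the string literal "HI". The payload is
   destructured with Ast_pattern and the replacement node is built with
   Ast_builder, so this code never touches Parsetree constructors directly. *)
let expand ~ctxt s =
  let loc = Expansion_context.Extension.extension_point_loc ctxt in
  Ast_builder.Default.estring ~loc (String.uppercase_ascii s)

let ext =
  Extension.V3.declare "shout" Extension.Context.expression
    Ast_pattern.(single_expr_payload (estring __))
    expand

let () =
  Driver.register_transformation "shout"
    ~rules:[ Context_free.Rule.extension ext ]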

In the case of ppx_jsobject_conv, the PPX is matching directly on the Parsetree. The parsetree has changed since it was written (Pext_decl now has three arguments), so it’s not working anymore. This would not have happened if the PPX were using Ast_pattern.pext_decl.
A workaround is to use an old enough version of ppxlib, so that the latest parsetree it uses is one where Pext_decl has only two arguments. The change was introduced in OCaml in this commit, which was released in OCaml 4.14. ppxlib started using the 4.14 AST in its version 0.26.
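
To illustrate what that breakage looks like (a rough sketch, with the field layout recalled from memory rather than copied from the compiler source): against ppxlib 0.26+, which exposes the 4.14-style AST, a direct match on Pext_decl has to account for a new first argument carrying the bound type variables, so older two-argument patterns no longer type-check.

open Ppxlib

(* Hypothetical helper matching directly on the Parsetree: with the
   4.14+ AST, Pext_decl carries (type variables, arguments, return type),
   so a pattern written for the old two-argument shape breaks. *)
let constructor_arity (ec : extension_constructor) =
  match ec.pext_kind with
  | Pext_decl (_type_vars, Pcstr_tuple args, _return_type) -> List.length args
  | Pext_decl (_type_vars, Pcstr_record fields, _return_type) -> List.length fields
  | Pext_rebind _ -> 0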

So opam install ppxlib.0.25.1 should do the trick here. And PPX authors should write parsetree-independent code as much as possible to avoid this situation.

2 Likes

Dear @Chet_Murthy
Just for the record:

  • it was not my intent to criticise any of the tools (opam or dune)
  • I suggest that the opam/dune dichotomy, together with the fluctuating AST rewriter situation (PPX), causes a lot of trouble for me as an end user
  • I too think opam was and is a massive improvement
  • the post was created out of frustration, and I acknowledge that the parts comparing the OCaml experience with other systems were unfair and not factually correct

Do you know more about this? What do they do exactly? Could I learn from their process?

I’m also not a big fan. It’s a huge dependency and there is more magic involved than I’m comfortable with. Also, it is constantly changing.
But from an end-user perspective, it has never caused me any problems; it is reliable.

No worries, and some further thoughts:

  1. Like I wrote, I personally detest dune, and find its opaqueness to be really problematic. That, and its focus on OCaml – at least the last time I tried, you couldn’t really use it to build anything not focused on OCaml, and since I have quite a few times worked on projects that are part OCaml, part C/C++/other languages, that’s a deal-breaker for me. I find it a great thing that I don’t need to sacrifice opam in order to use Makefiles.

  2. OTOH, yes, I kind of agree with you about PPX: it’s a big pain. But I believe that the reason isn’t anything intrinsic to PPX rewriters[a]; the real problem with PPX is that it lacks quasi-quotations. I maintain Camlp5 and its associated pa_ppx PPX rewriters, and have done so for a while. And while, sure, there’s some porting work with each new major release of OCaml, that work is pretty quick and doesn’t take very long. And why? Because while the AST might change (e.g. Pext_decl) from release to release, the concrete surface syntax cannot change except in very, very backward-compatible ways. In my experience, that’s enough to make most of the pa_ppx-based rewriters “just work” with new versions of OCaml. I’ll also note that Rust’s macros are pretty much based on the surface syntax, too.

Anyway, I believe that OCaml’s PPX can have powerful quasi-quotations too. I wrote a post about it a few days ago, and have a prototype: GitHub - camlp5/pa_ppx_parsetree: Tooling for doing things with OCaml's official AST

[a] Rust (for example) also has macros. But Rust macros don’t need to be as complex as OCaml’s, because Rust has traits (ad-hoc polymorphism / “modular implicits”), and that offloads much of what makes OCaml’s macros so complex.

1 Like

I’d like to clarify a couple of things about ppxlib and the PPX ecosystem and how it influences building software with OCaml.

The initial statement of this thread was that it’s generally “hard to build OCaml software”. The example you gave to support that statement, @koala, was a PPX incompatibility. That example was very good to read about, since it shows us the limits of our compatible-PPX-ecosystem efforts! However, I don’t think it’s a good example to support the thesis that it’s generally “hard to build OCaml software”: as @dbuenzli has pointed out, the opam-repository ecosystem provides a great experience, including PPXs! And as @davisnx has pointed out, the problematic PPX of your example isn’t on the opam-repository.

About the question of whether the PPX in question, jsoo-react.ppx, did something “wrong”: “wrong” is a strong word, but I think there are things that both we, the ppxlib maintainers (possibly in collaboration with other ecosystem tool maintainers), and the maintainers of jsoo-react.ppx could do or could have done better. For the latter: when writing a PPX that’s not released on opam, I strongly recommend adding an upper bound on the ppxlib version (that’s not necessary or recommended for PPXs released on opam).

Some of the comments have been about the question of what has been improved over the years in the PPX world to provide good compatibility. The answer is: lots of things :stuck_out_tongue: . Let me focus on the three things that are most relevant in this context. One is opam and the opam-repository themselves: as mentioned, thanks to the opam-repo maintainers, the opam PPX ecosystem is always consistent. The next one is related, but an effort on the ppxlib side: whenever we cut a release that adapts to compiler changes which break PPXs, we create a big workspace with close to all opam PPXs and patch the ones we break. That makes sure that the opam PPX ecosystem is not only consistent, but also up-to-date and compatible. And the third one is the one @panglesd has mentioned in his answer above: since ppxlib.0.26.0, we’ve considerably improved our way of adapting to compiler changes. To give an example: the 4.14 compiler release introduced quite a fundamental parsetree change. For those familiar with the parsetree: Ppat_construct was modified. Without the improvement we made, we would have broken over 30 PPXs. With the improvement, we broke “only” 9 (and, of course, patched them). Side note: modifications to parsetree nodes like Ppat_construct aren’t very common, so breakage numbers this high aren’t usual.
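
For readers curious what that Ppat_construct change looks like in practice, here is a rough sketch (the exact shape is recalled from memory, so treat it as approximate rather than the verbatim compiler definition): the payload argument went from a plain optional pattern to an optional pair that also carries existentially bound type variables, so any PPX destructuring the old shape directly had to be patched.

open Ppxlib

(* Hypothetical helper written against the 4.14+ parsetree: the second
   argument of Ppat_construct is now an option of (bound type variables,
   argument pattern) instead of just an optional pattern. *)
let constructor_argument (p : pattern) : pattern option =
  match p.ppat_desc with
  | Ppat_construct (_name, Some (_bound_type_vars, arg)) -> Some arg
  | Ppat_construct (_name, None) -> None
  | _ -> None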

About a couple of other comments:

“trying to get all ppx-es used in a project to agree on a non-empty intersection for ppxlib version bounds and such can be challenging”

This should not be the case. If you have a recent example of this being challenging, it would be very interesting to hear about it!

“the real problem with PPX is that it lacks quasi-quotations”

For a (limited but very helpful) quotation and anti-quotation system, you have metaquot. Possibly, one of the things you’re saying is that it would be nice if metaquot were a bit more powerful. I agree with that. However, as @panglesd has pointed out, metaquot isn’t the only way to write a stable PPX: since ppxlib.0.26.0, the Ast_builder and Ast_pattern modules also provide a stable sub-API.
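
As a quick illustration of what metaquot gives you today (a hypothetical helper, assuming the ppxlib.metaquot preprocessor is enabled for the file): [%expr ...] quotes a Parsetree expression, and [%e ...] is the anti-quotation that splices an existing expression back in.

open Ppxlib

(* Wrap an expression so that it prints a message before returning its
   value. metaquot expands the [%expr ...] quotation using the [loc]
   variable in scope, and [%e e] splices the given expression in. *)
let log_and_return ~loc (e : expression) : expression =
  [%expr
    let result = [%e e] in
    print_endline "computed";
    result]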

7 Likes

This is pretty unpleasant IMO. I don’t think ad hominem insults as in the first quote have any place here, and a confrontational response to a quite friendly tap on the shoulder doesn’t help, either.

5 Likes

Thanks for acknowledging my initial issue, and even more thanks for taking the time to write such a detailed answer. Unfortunately I don’t understand even half of it, but let me ask this probably ill-informed and naive question:

Would it be possible to handle PPXs as some form of compiler extension or plugin rather than as a library? And then maintain an approval process where the packaging and distribution of PPXs are managed more strictly, and a PPX has to actively ask for permission in order to work with the AST of the compiler?

In my experience the intersection between the people who like to experiment with and author PPXs and the people who like to maintain and run OCaml code over a longer period of time is rather small. I now see the value of PPXs, but I’m still not sure they are worth all the trouble they cause.

From a theoretical point of view, to a certain extent that would be possible. From a practical point of view, it would mean adding a lot of complexity and choices to the compiler and leaving the compiler maintainers (who already have a lot of work to do and are doing a great job IMO!) with a significant new maintenance burden. So we’re planning a hybrid between the two: upstream some parts of ppxlib to the compiler, concretely a library called Astlib, while keeping the main ppxlib logic separate. To be clear: I personally also think that a compiler-integrated macro-style system would be great in a theoretically ideal world, but practically I think we can achieve a good PPX ecosystem most people will be happy with.

I’m not sure what you have in mind when saying “rather small”, but I think lots of people trust the PPX ecosystem by now. As mentioned a couple of times: as long as you stick to the opam-repo, you should currently not run into PPX compatibility problems of the kind you encountered. Here’s what the numbers say about the intersection you’re mentioning:


$ opam list --depends-on=ppxlib --recursive -s | wc -l
1836

PS: I’ve already opened a PR on jsoo-react.ppx to add the upper ppxlib bound I mentioned before.

2 Likes

To give a bit of historical background: once upon a time, the OCaml compiler used to have a built-in mechanism for syntactic extension called Camlp4. Maintaining it was such a nightmare that it was decided to get rid of it.

Later, the PPX mechanism was introduced to have something to offer to those users who still wanted some way of extending OCaml’s syntax; see ppx and extension points | LexiFi and the references therein for the history (especially the discussion that took place on the mailing list).

But the fact is that this was thought of as a kind of “escape hatch” for specific situations, and no one thought that a cottage industry was going to develop on these foundations, or that the community was going to become so dependent on this technology. Be that as it may, PPX has some fundamental weaknesses (e.g. the dependency on the Parsetree types) which are hard to get around. And it is not even clear that the syntactic approach of PPX is the best solution to many of the problems it is applied to (as opposed to compile-time macros, type reflection, etc.).

Cheers,
Nicolas

5 Likes

Compile-time macros that translate string literals into code would be a downgrade in terms of editor support and error reporting. Compile-time macros that are just staged OCaml code would preclude or degrade a lot of use cases, in two broad groups: code instrumentation (for debugging or measuring) and domain-specific, no-boilerplate embedded languages.

1 Like

I’m new to the OCaml ecosystem myself (and also new to CI/CD), but I published a package on opam recently and thought the automated checks used by the maintainers were pretty cool.

Some of the cool things I’ve seen their CI checks do include:

  1. Build the package being submitted on (a) different compiler versions (including what seems like more obscure variants like the flambda version), which helps identify the minimum version (“lower bounds”) that your package builds with, and (b) different operating systems like Windows, Mac and various Linux distributions to help ensure the portability of your code.
  2. Retroactively adding an upper bound (the maximum compiler version your package builds with) as other packages which your package depends on receive updates.
  3. Testing the package with your unit tests after it is built on each compiler version and operating system (useful if, for example, one of your dependencies has a bug fix that makes your test fail on one version of the dependency but pass on a different version).

I wouldn’t have done all of those checks myself, and it helped quite a bit. The first point, about building on different compiler versions, let me notice a build failure: my package didn’t build with the flambda variant because a generated source file was unreasonably large. The maintainers seem like nice people too, and were patient in helping me get my package on there.

12 Likes

Yes, you have to scan all the opam packages and reverse the mapping. I guess if you wanted to do that on every build it might become a performance problem, but it’s pretty quick and only needs to be done when a dune file is changed.

1 Like

It is more than a year since I posted this rant, and today, while I was doing some obscure things in the terminal with melange and opam, staring at the screen while waiting for an opam update to complete, I came to a conclusion. I still mostly have no idea what happens behind the curtain, but I cannot remember the last time I ran into serious problems. I cannot remember my last PPX-related error. Things really have improved, and I guess hard work was involved. So thanks a lot to all of you for constantly improving things.

12 Likes