Ahrefs is now built with Melange

Hello!

Since last September, the Melange, Dune, and Ahrefs development teams have been working to enhance the integration between Dune and Melange. As a company that uses a lot of OCaml in the backend, Ahrefs saw an opportunity to bring its frontend stack closer to OCaml by using Melange while still integrating with the JavaScript ecosystem of UI libraries. Thus, the company decided to invest and actively participate to make this integration happen.

I am happy to announce that we have achieved a significant milestone in this integration process: we transitioned all Ahrefs frontend projects to use Melange. We have explained this transition in detail in the blog post “Ahrefs is now built with Melange. OCaml, all the way down”.

Regarding the current state of Melange, it’s worth noting that our focus thus far has been on designing and implementing the Dune-Melange integration and applying it within Ahrefs. The goal has been to demonstrate that the toolchain can scale to medium-to-large codebases, and the result has been successful so far. The process has been beneficial not only for Melange but also for Dune itself: we were able to identify and address some performance issues, including one fix that made some build commands nearly 10 times faster in our case.

While we’ve made significant progress with the Dune-Melange integration, we recognize that there is still work to be done to improve the documentation and developer experience. Currently, Melange lacks a dedicated documentation site, and the latest functionality isn’t yet available in published versions of Dune and Melange on the opam repository.

We’re actively working to address this, but in the meantime, we invite those who are adventurous to explore the melange-opam-template and review the newly added melange.emit stanza documentation found in the latest version of Dune’s documentation. If you have any questions, encounter any issues, or otherwise want to participate in any way, we invite you to join the #melange channel in the Reason Discord.

Thank you for taking the time to read about our progress with the Dune-Melange integration. We hope you share our excitement about this project!

I’m surprised I did not hear of Melange until now. Is the following correct:

Compared to ReScript:

  • Melange has better support for OCaml syntax
  • Melange has better support for ppx
  • Melange appears slower than ReScript at compilation (why?)

Compared to Jsoo:

  • Melange compiles from the AST (like ReScript), not from bytecode
  • Melange generates more readable JS
  • How does Melange compare to Jsoo in terms of which OCaml features and libraries are supported or not supported?

Thanks!

Hey @zeroexcuses,

thanks for your interest.

I guess it is expected that you had not heard of Melange until now. It is a recent project, and up until now the focus has been on making sure the idea is feasible and works. With the first iteration of Melange, @anmonteiro proved that it was possible to upgrade the project to be compatible with the most recent versions of the OCaml compiler, and also to model it as a “compiler libs” library rather than a full fork. It also integrated with Dune, but at a raw level: Melange would generate rules for Dune to execute, but there was no concept of libraries yet. The second phase, which was completed recently, has proven that a deeper integration with Dune is possible and that Melange is usable in large projects. I guess we are now entering a third phase of the project, which will consist of documenting the toolchain and porting over existing libraries and bindings so that other people can start using it easily.

Besides the points you mentioned, I’d like to mention a couple more:

  • One of the biggest upsides imo is having access to the OCaml editor platform. Years of effort in Merlin, OCaml LSP and extensions like vscode-ocaml make the Melange developer experience really ergonomic.
  • Another difference is how package management is handled: while with ReScript every dependency can be downloaded with just npm, Melange projects have to use both opam and npm. This is a trade-off. On one hand, most Melange projects have to deal with both package.json and opam files. On the other hand, they can benefit from opam’s source-based package distribution model for things like PPXs, linters, or any other OCaml tooling. By comparison, consuming any OCaml ecosystem tool in ReScript is more challenging: as npm is a package manager designed for an interpreted language, distribution requires prebuilt binaries. The fact that ReScript is based on a now quite old version of the compiler does not make things easier.

On Melange appearing slower than ReScript at compilation: I think this is caused by a combination of factors. Some of them can be solved, some of them are by design:

  • Dune’s design requires walking the whole project tree to find dune files when building. ReScript, as it’s based on Ninja, can read all the project information in a centralized way, from a single file.
  • Dune doesn’t use file modification times, but rather detects changes by hashing the file contents.
  • Integration with Dune forced Melange to build things in a more “staged” way, so that object files (cmj) and the resulting JavaScript files are generated in 2 steps.
  • Melange’s compilation process has not been fully parallelized yet.
  • ReScript optimizes the way the dependencies of a given module are calculated by caching the results. Melange uses ocamldep through its integration with Dune, and this step is not optimized or cached.
  • ReScript has a “tighter” design, which provides more opportunities to optimize things like the ReactJS ppx, which can be “fused” with the compiler. Melange, on the other hand, prioritizes extensibility and ecosystem integration, so the ReactJS pre-processing is done through a regular PPX.

I am sure there are many other things that I am not aware of, maybe someone with more knowledge can add or refine the points above :slight_smile:

On the comparison with Js_of_ocaml: again, I am probably missing some things, but off the top of my head:

  • Melange won’t support anything running on OCaml 5 (yet).
  • Js_of_ocaml allows compiling the compiler itself and creating “toplevels”, which is not possible with Melange.
  • I think that Marshal support works well in Js_of_ocaml, while Melange does not support it.
  • Libraries like Unix or Str are available in Js_of_ocaml but not in Melange.

On the upside, in Melange:

  • Writing bindings is a bit easier (imo) thanks to the inherited ReScript compilation model
  • There is great support for libraries like ReactJS or GraphQL clients
  • Bundles are smaller
  • Integration with JavaScript tooling like Webpack or NextJS is straightforward, thanks to the compilation model where 1 module maps to 1 JS file

These three projects are complex, and there is a lot of nuance. Hopefully this helps clarify some of the trade-offs.

The link does not work. I get an error from Medium.

Yes, sadly there is a problem with the domain and Medium. The post can be accessed on Medium: “Ahrefs is now built with Melange. OCaml, all the way down” by Javier Chávarri (Ahrefs, May 2023).

Dune does indeed rely on modification times. In particular, Dune will not recompute the digest of a file if its mtime hasn’t changed.

Thanks for the clarification, I was not aware of that part. Would it be more accurate to describe how Dune works as follows?

Dune does not rely solely on file modification times to infer that a file has changed and needs to be rebuilt, but it adds another check based on the file content hash.

My understanding is that this two-step approach is beneficial when some tool changes the modification time while the content remains the same as in the last build, for example when checking out branches with git.
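To make the two-step check discussed above concrete, here is a minimal sketch in OCaml. It is a hypothetical model and not Dune’s actual implementation: the `file`, `cached` and `verdict` types and the `check` and `remember` functions are names invented for illustration, and files are simulated as in-memory records so the example only needs the standard library.

```ocaml
(* Hypothetical model of an mtime-then-digest check; not Dune's real code. *)

(* A simulated file: its modification time and its contents. *)
type file = { mtime : float; contents : string }

(* What was recorded about the file during the previous build. *)
type cached = { last_mtime : float; last_digest : Digest.t }

type verdict =
  | Unchanged_mtime (* mtime identical: the digest is not recomputed *)
  | Same_contents   (* mtime changed but digest identical: no rebuild *)
  | Changed         (* digest differs: a rebuild is needed *)

let remember (f : file) : cached =
  { last_mtime = f.mtime; last_digest = Digest.string f.contents }

(* Step 1 compares mtimes; step 2 hashes the contents only if step 1 fails. *)
let check (c : cached) (f : file) : verdict * cached =
  if Float.equal f.mtime c.last_mtime then (Unchanged_mtime, c)
  else
    let d = Digest.string f.contents in
    let c' = { last_mtime = f.mtime; last_digest = d } in
    if Digest.equal d c.last_digest then (Same_contents, c') else (Changed, c')

let () =
  let original = { mtime = 1.0; contents = "let x = 1" } in
  let cache = remember original in
  (* Something like a branch checkout touches the file without editing it. *)
  let touched = { original with mtime = 2.0 } in
  let verdict, cache = check cache touched in
  assert (verdict = Same_contents);
  (* A real edit changes both the mtime and the digest. *)
  let edited = { mtime = 3.0; contents = "let x = 2" } in
  assert (fst (check cache edited) = Changed)
```

In this model the content hash is the final authority on whether a rebuild is needed, and the mtime only serves as a cheap way to skip hashing.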

The original link should be fixed now.

Hello @jchavarri, I’m curious: in OCaml, List seems to be the prevailing data structure, but it is inefficient in JavaScript (ReScript, for example, flipped its syntax so that arrays are defined with []).

In the frontend are you using arrays or lists?

Hi @mudrz, I don’t believe in OCaml there is a prevailing data structure. What I see people doing and recommending is to use the right data structure for the job. The OCaml docs have a great section about the different options and how they perform depending on the actions they are used for: Comparison of Standard Data Structures · OCaml Tutorials.

In the frontend, like in the backend, we use the data structure that best fits the problem at hand. For small collections of just a few items, I don’t think the difference between arrays and lists will be noticeable for the majority of applications. For large collections, assuming you have to optimize for insertions, lists are probably more performant than arrays: Benchmark: Array vs Linked List - MeasureThat.net. If you need random access, arrays are probably better. In general, I’d be suspicious of anyone claiming there is “one data structure to rule them all”.
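As a small illustration of the trade-off described above, here is a sketch in plain OCaml. The helper names are made up for this example, and it demonstrates sharing and copying rather than measuring performance. As far as I know, Melange compiles OCaml arrays to JavaScript arrays and lists to nested cons cells, so the same reasoning should carry over, but treat that as an assumption.

```ocaml
(* Prepending to an immutable list allocates one cell and shares the tail. *)
let prepend x xs = x :: xs

(* Prepending to an array copies every existing element into a new array. *)
let prepend_array x arr = Array.append [| x |] arr

(* Arrays give constant-time indexing; List.nth has to walk the list. *)
let third_of_array arr = arr.(2)
let third_of_list xs = List.nth xs 2

let () =
  let xs = [ 2; 3; 4 ] in
  let ys = prepend 1 xs in
  (* The tail of ys is physically the same list as xs: nothing was copied. *)
  assert (List.tl ys == xs);
  assert (prepend_array 1 [| 2; 3; 4 |] = [| 1; 2; 3; 4 |]);
  assert (third_of_array [| 1; 2; 3 |] = third_of_list [ 1; 2; 3 ])
```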

About syntax :curly_loop: , I can only say that flipping syntax in Melange would break compatibility with OCaml. So I assume [] will remain the syntax for lists in Melange because staying close to OCaml is one of the project design goals.

I should expand on what I mean: List seems to be a much more common data structure in native OCaml than Array. I consider it the prevailing data structure of the two, because even JSON libraries represent collections with lists rather than arrays.

In JS, on the other hand, arrays are implemented natively by the JavaScript engine, so they take less space (for example in the compiled code) and operations on them are, on average, faster.

What I mean by ‘whether you use arrays or lists in JavaScript’ is not to discourage picking the right data structure for the job. Rather, most Reason/BuckleScript frontend codebases I’ve seen pick either array or list for client-server communication and for rendering lists in React. That is simply more practical than converting between the two depending on what gives marginal benefits in a particular scenario.
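A sketch of what that choice can look like in OCaml, assuming a codebase that settles on arrays at the client-server boundary. The `item` type, the `from_server` value and the `labels_with_header` function are hypothetical names used only for illustration. Each `Array.to_list` or `Array.of_list` call copies the whole collection, which is the conversion cost mentioned above.

```ocaml
(* Hypothetical item type standing in for data decoded from a server response. *)
type item = { id : int; label : string }

(* Arrays are the single representation used at the boundary... *)
let from_server : item array =
  [| { id = 1; label = "a" }; { id = 2; label = "b" } |]

(* ...and a list is used only locally, at the cost of two full copies. *)
let labels_with_header (items : item array) : string array =
  let body = items |> Array.to_list |> List.map (fun i -> i.label) in
  Array.of_list ("header" :: body)

let () =
  assert (labels_with_header from_server = [| "header"; "a"; "b" |])
```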