Why Lean 4 Replaced OCaml as My Primary Language

Yes, but not anywhere, right? Like, there is no block just to introduce a new scope; they all do something specific, and there is the statement vs. expression split.

Sure, C does have separate grammars for statements and expressions, but that is completely orthogonal to hygiene. Even if the ({ ... }) statement-expression extension were eventually standardized in C (it is long overdue), it would not change the fact that an unhygienic macro system is unhygienic.
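To make the contrast concrete, here is a small sketch (mine, not from the thread) of what hygiene buys you, written with Rust's `macro_rules!` since a C `#define` has no equivalent safeguard:

```rust
// A macro that introduces its own local binding named `temp`.
macro_rules! add_one {
    ($e:expr) => {{
        let temp = 1; // hygienic: this `temp` is invisible to the caller's code
        $e + temp
    }};
}

fn main() {
    let temp = 100;
    // Textually this expands to { let temp = 1; temp + temp }, which an
    // unhygienic system like the C preprocessor would evaluate to 2.
    // Rust keeps the two `temp`s distinct, so `$e` refers to the caller's
    // binding and the result is 100 + 1.
    assert_eq!(add_one!(temp), 101);
    println!("{}", add_one!(temp));
}
```

The same expansion done by textual substitution captures the caller's variable; hygiene is exactly what prevents that, independently of how the statement/expression grammar is organised.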

2 Likes

Thank you for the write-up @Gopiandcode. While I agree with some of the people here that some parts of the write-up do not fully carry the ideas across, I still agree with most of the issues you have with OCaml, and I think they should not be dismissed just because the blog post is not as carefully written as a scientific paper would be :sweat_smile: It’s a blog post; it’s not meant to be perfect.

Specifically:

  • Lack of optimisations: I do agree, but I’m also OK with what I have, to be fair; my applications don’t need more performance, and OCaml is already very performant as it is. I’m also happy that, when I find hot paths, I can optimise them should I want to. I was told that Haskell is particularly frustrating in that spending time optimising code can sometimes have negative effects.
  • Conservative release cycle: I 95% agree with this, up to the point that there are very few people who have the skills and time to review things like the Stdlib, and they do review and make what progress they can. I also wish we had features like modular implicits by now, but we don’t, and it is what it is. I already think that effects are a massive quality-of-life improvement (even without an effect system).
  • Opinionated build system: that one I disagree with. Nothing prevents you from calling a makefile from Dune, and writing your own rules to produce text is not difficult. I do agree that the ecosystem as a whole is too full of paper cuts, but I can live with a not-perfect ecosystem to use a great language.
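For concreteness on the "write your own rules" point, a Dune rule that produces a text file from a script might look something like this (file and script names are hypothetical; this is a sketch, not from the thread):

```dune
; Hypothetical sketch: generate an OCaml source file from a shell script.
(rule
 (targets version.ml)
 (deps gen_version.sh)
 (action (with-stdout-to version.ml (run ./gen_version.sh))))
```

The same `rule` stanza with `(run make ...)` as the action is how one would delegate to an existing makefile.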

One mischaracterisation in your post is that the first PR was closed because of naming debates; that is not correct as far as I understand. The first PR was closed because the structure had an unnecessary indirection for each element. I did get frustrated by the fact that, by the time Dynarray was merged, OCaml 5 had come out and they ended up converging on the implementation with an indirection, but :person_shrugging:

Now the one opinion I really agree with is the problem of meta-programming. I don’t need all the power of Lean’s meta-programming; as @yawaramin said, most use cases need simple macros, and I simply cannot write these without resorting to a ppx, which is way too unwieldy for writing simple wrappers. Hygienic macros written directly in the language are the one feature I really, really envy from Rust. The second is the token-stream-based macro interface, instead of an AST-based interface, which makes everything much more stable.

5 Likes

I like the AST-based interface, as it makes syntax extensions work better with other tools: LSP, ocamlformat.

A few years ago, while working heavily with Rust (for IBM Quantum), I did some analysis to figure out what it would take to implement Rust-style macros for OCaml. I maintain and use Camlp5, so it wouldn’t be very hard to hack together the parsing and AST support. But I realized pretty quickly that for anything other than trivial uses, such a facility would be useless, because too often macros depend on type-directed compilation (that is to say, “typeclasses”/“modular implicits”). E.g., the “debug-print” macro.
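To illustrate the point with a sketch (my example, not from the thread): a debug-print macro in Rust expands uniformly, but it only works because the `{:?}` formatting is resolved through the `Debug` trait, i.e. type-directed compilation supplies a printer per concrete type.

```rust
// The macro itself is type-agnostic; the trait system ("typeclasses")
// decides how to actually print each value at the expansion site.
macro_rules! debug_print {
    ($e:expr) => {{
        let v = $e;
        println!("{} = {:?}", stringify!($e), v); // requires `v: Debug`
        v
    }};
}

fn main() {
    let x = debug_print!(1 + 2); // prints: 1 + 2 = 3
    let xs = debug_print!(vec![x, x * 2]); // prints the expression text and [3, 6]
    assert_eq!(xs, vec![3, 6]);
}
```

Without something playing the role of `Debug` (typeclasses or modular implicits), the macro cannot know how to render an arbitrary value, which is exactly the limitation described above.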

1 Like

As I mentioned in my comment up-thread, you can get this in OCaml with a sufficiently robust meta-quotation facility. What there is today is not “sufficiently robust”, because macro-writers do not use it almost-exclusively; they still resort to direct AST manipulation.

I think it would be a useful goal for somebody to make metaquotation sufficiently powerful that most uses of direct AST manipulation in PPX rewriters disappeared.

I will note that one of Daniel de Rauglaudre’s original insights driving Camlp5 (originally Chamau) was that the homoiconicity that made LISP/Scheme macros so effective had its counterpart in ML not in direct AST manipulation, but in powerful meta/anti-quotation facilities. One can throw away a lot of Camlp5 and replace it with more-standard stuff, but this part remains far more powerful than anything else on offer in the OCaml ecosystem.

Not quite sure what it helps with in these cases

Oh interesting, I never thought about this. But I don’t think that’s fully true. In the past, I wrote extensive macros in Rust to define small DSLs that used Rust tokens but not Rust syntax, which wouldn’t be possible in OCaml, since I only get a well-formed AST or nothing.
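As a sketch of the kind of thing meant here (my own illustration, not the poster's actual macros), `macro_rules!` can match input that is not a valid Rust expression at all, because it sees token trees rather than a parsed AST:

```rust
// A tiny transition-table DSL: `idle => running;` is not a Rust expression,
// but it is a perfectly good sequence of tokens for the matcher below.
macro_rules! arrows {
    ( $( $from:ident => $to:ident ; )* ) => {
        vec![ $( (stringify!($from), stringify!($to)) ),* ]
    };
}

fn main() {
    let edges = arrows! {
        idle => running;
        running => done;
    };
    assert_eq!(edges, vec![("idle", "running"), ("running", "done")]);
}
```

An AST-only macro interface would reject `idle => running;` before the extension ever saw it, which is precisely the limitation being described.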

I also agree that maybe I should just put effort into improving metaquotation and reducing boilerplate for ppxlib. One has to admire the simplicity of

use proc_macro::TokenStream;

#[proc_macro]
pub fn macro_name(input: TokenStream) -> TokenStream {
    input
}

And absolutely nothing else

In case you have those examples lying around, I would very much like to see them. As I wrote, I thought about implementing Rust-style macros as part of Camlp5, but couldn’t think of enough useful examples. So having some would be most excellent.

They can be in Rust, and even really sketchy; I just need to see the idea.

–chet–
no one leaves home unless
home is the mouth of a shark

Oh interesting, I never thought about this. But I don’t think that’s fully true. In the past, I wrote extensive macros in Rust to define small DSLs that used Rust tokens but not Rust syntax, which wouldn’t be possible in OCaml, since I only get a well-formed AST or nothing.

I was very impressed with Rust’s macro system – their token trees and such. But seriously, if you would be willing to provide the -barest- sketches of what you did in Rust, I would be happy to try to do the same in Camlp5 as-is, and then, if that wasn’t commodious and pleasant, to implement something like Rust’s macros to make it commodious and pleasant.

Thing is, without motivating examples, it’s tough to get geared-up to do anything. And lately, I’m not very motivated. It’s the world situation [we mustn’t say anything about that in this forum, so I’ll leave it there.]

2 Likes

I think you might be correct: it makes no difference, because the LSP is only aware of the post-transformation representation. Does ocamlformat work with the pre-transformation or the post-transformation syntax tree? I actually don’t know. But if it works with the post-transformation tree, our syntax extensions had better be really good at tracking locations between source and output; it’s easy to get lazy and use a single location for a whole generated subtree. One point would ultimately remain: giving users some degree of syntactic familiarity, for example things like operator precedence.

pre-transformation or the post-transformation syntax tree

It runs on the pre-ppx AST. It has to, because the text being formatted is the source.

1 Like

So moving to token streams would disable ocamlformat for the code in extended syntaxes, a significant downside.

You could write something like ocp-indent that would work on the token stream, or maybe have some sort of modular architecture where the syntax extension itself is responsible for formatting.
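As a toy sketch of what "formatting on the token stream" could mean (entirely my own illustration, with tokens modeled as plain strings), an indenter only needs to track bracket depth, not a full AST:

```rust
// Reindent a flat token stream using only lexical structure (brace depth),
// the way an ocp-indent-style tool can work without parsing.
fn indent(tokens: &[&str]) -> String {
    let mut out = String::new();
    let mut depth: usize = 0;
    for &t in tokens {
        match t {
            "{" => {
                out.push_str("{\n");
                depth += 1;
                out.push_str(&"  ".repeat(depth));
            }
            "}" => {
                depth = depth.saturating_sub(1);
                out.push('\n');
                out.push_str(&"  ".repeat(depth));
                out.push('}');
            }
            _ => {
                out.push_str(t);
                out.push(' ');
            }
        }
    }
    out
}

fn main() {
    assert_eq!(indent(&["f", "{", "x", "}"]), "f {\n  x \n}");
    println!("{}", indent(&["let", "y", "=", "{", "g", "()", "}"]));
}
```

A real tool would of course need the language's actual lexer, but the point stands: layout can be recovered from tokens alone, without the extension's semantics.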

I don’t want to sound like a broken record here, but this is something Lean somewhat has.

When declaring syntax extensions to the parser, the library provides a number of no-op combinators that are only relevant for pretty printing – things like ppSpace, ppLine, etc.

I guess it’s not an exact one-to-one comparison, as Lean grammars can be indentation-sensitive, so formatting is often part of the parsing process, but it’s another cool feature of macros in Lean that I’ve really found a breath of fresh air.

2 Likes

Regarding performance, I was curious to compare Lean and OCaml on one small micro-benchmark that is somewhat relevant to my use case.

I use Iter.t very extensively as my main “nondeterminism” abstraction. I can’t do this in Lean, but I can use an equivalent of Seq.t (which I had to implement myself, basically copying the OCaml implementation).

I then wrote the following test in Lean:


def l : LazyList Nat := do
  let x <- LazyList.init (fun i => i) 100000
  let y <- LazyList.cons x (LazyList.cons (x + 1) (LazyList.singleton (x + 2)))
  pure (y * 2)


def main : IO Unit :=
  let sum := LazyList.foldl (fun a b => a + b) 0 l
  IO.println sum

and OCaml:

let l =
  let ( let* ) x f = Seq.flat_map f x in
  let* x = Seq.init 100000 (fun x -> x) in
  let* y = Seq.cons x (Seq.cons (x + 1) (Seq.return (x + 2))) in
  Seq.return (y * 2)

let () =
  let sum = Seq.fold_left ( + ) 0 l in
  Printf.printf "%d\n" sum

And :drum:
The Lean 4 binary is ~10 times slower than the OCaml one.
Is there a way to make an optimised build in Lean 4 that I am missing? Mind you, I haven’t run any optimiser on the OCaml binary: no flambda, no -O3.

In its current state, I would definitely not call Lean 4 “surprisingly fast”; maybe “surprisingly fast for a theorem prover”.

1 Like

You are comparing arbitrary-precision integers on the Lean side (Nat) with machine integers on the OCaml side (int), so this is a completely unfair benchmark. You should either replace Nat with UInt64 for Lean, or open Z (from Zarith) at the start for OCaml, so that roughly the same arithmetic operations are performed on both sides. (That said, I still expect OCaml to win by a large margin, but not by a factor of 10.)

Good point. That being said:

  • Changed OCaml’s ints to Z: no observable performance change, so OCaml is still 10x faster.
  • Changed Lean’s Nat to UInt64: still no observable performance change.

So no, integers are not the issue here.

1 Like

Your Lean implementation of LazyList.init is not quite right. Notice how the go function is fully applied. As a consequence, instead of creating a single closure, you are actually creating a list of 100’000 closures, which then has to be visited.

Also, if the performance of your code does not change when you switch between machine integers and large integers, then it is doubtful that you are actually measuring the ability of the compiler to properly optimize (or not) the code. Rather, you are measuring the overhead of the runtime. So, you might want to consider a different benchmark. (But if your point was to measure on a toy example the difference between both models of memory allocation and garbage collection, then it might be fine, due to the sheer amount of temporary objects.)

1 Like

I mean, yeah, this is only a tiny micro-benchmark, but Lean code makes very extensive use of monads, and if it’s not great at creating closures, I’m pretty worried about performance in general.

I reimplemented init as follows (which I believe doesn’t suffer from the flaw you pointed out) and it did not change the speed at all.

def init (f : Nat -> α) (n : Nat) : LazyList α :=
  let rec go (i : Nat) (_ : Unit) : LazyListNode α :=
    if i < n then LazyListNode.cons (f i) (LazyList.mk (go (i + 1))) else LazyListNode.nil
  LazyList.mk (go 0)

I do agree that this definitely needs better benchmarks. I was curious to reimplement my latest project from OCaml in Lean, but I got ultra-bored from the headache of threading an IO monad everywhere just to be able to communicate with an external process or write logs :upside_down_face:

All caught up on this thread.

Didn’t understand a bunch of stuff here yet (web developer).

But it seemed difficult, or maybe it’s my lack of knowledge at the time.

Cheers all.