Why Lean 4 Replaced OCaml as My Primary Language

Yes, but not anywhere, right? Like, there is no block just to introduce a new scope; they all do something specific, and there is the statement vs. expression split.

Sure, C does have separate grammars for statements and expressions, but that is completely orthogonal to hygiene. Even if the ({ ... }) statement-expression extension were eventually standardized in C (it is long overdue), it would not change the fact that an unhygienic macro system is unhygienic.
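To make the contrast concrete, here is a small sketch (mine, not from the thread) of what hygiene buys you, written with Rust's `macro_rules!` since a C `#define` has no equivalent safeguard:

```rust
// A macro that introduces its own local binding named `temp`.
macro_rules! add_one {
    ($e:expr) => {{
        let temp = 1; // hygienic: this `temp` is invisible to the caller's code
        $e + temp
    }};
}

fn main() {
    let temp = 100;
    // Textually this expands to { let temp = 1; temp + temp }, which an
    // unhygienic system like the C preprocessor would evaluate to 2.
    // Rust keeps the two `temp`s distinct, so `$e` refers to the caller's
    // binding and the result is 100 + 1.
    assert_eq!(add_one!(temp), 101);
    println!("{}", add_one!(temp));
}
```

The same expansion done by textual substitution captures the caller's variable; hygiene is exactly what prevents that, independently of how the statement/expression grammar is organised.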

2 Likes

Thank you for the write-up @Gopiandcode. While I agree with some of the people here that some parts of the write-up do not fully carry the ideas across, I still agree with most of the issues you have with OCaml, and I think they should not be dismissed just because the blog post is not as carefully written as a scientific paper would be :sweat_smile: It’s a blog post; it’s not meant to be perfect.

Specifically:

  • Lack of optimisations: I do agree, but I’m also OK with what I have, to be fair; my applications don’t need more performance, and OCaml is already very performant as it is. I’m also happy that, when I find hot paths, I can optimise them should I want to. I was told that Haskell is particularly frustrating in that spending time optimising code can sometimes have negative effects.
  • Conservative release cycle: I 95% agree with this, up to the point that there are very few people who have the skills and time to review things like the Stdlib, and they do review and make what progress they can. I also wish we had features like modular implicits by now, but we don’t, and it is what it is. I already think that effects are a massive quality-of-life improvement (even without an effect system).
  • Opinionated build system: that one I disagree with. Nothing prevents you from calling a makefile from Dune, and writing your own rules to produce text is not difficult. I do agree that the ecosystem as a whole is too full of paper cuts, but I can live with a not-perfect ecosystem to use a great language.
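For concreteness on the "write your own rules" point, a Dune rule that produces a text file from a script might look something like this (file and script names are hypothetical; this is a sketch, not from the thread):

```dune
; Hypothetical sketch: generate an OCaml source file from a shell script.
(rule
 (targets version.ml)
 (deps gen_version.sh)
 (action (with-stdout-to version.ml (run ./gen_version.sh))))
```

The same `rule` stanza with `(run make ...)` as the action is how one would delegate to an existing makefile.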

One mischaracterisation in your post is that the first PR was closed because of naming debates; that is not correct as far as I understand. The first PR was closed because the structure had an unnecessary indirection for each element. I did get frustrated by the fact that, by the time Dynarray was merged, OCaml 5 had come out and they ended up converging on the implementation with an indirection, but :person_shrugging:

Now the one opinion I really agree with is the problem of meta-programming. I don’t need all the power of Lean’s meta-programming; as @yawaramin said, most use cases need simple macros, and I simply cannot write these without resorting to a ppx, which is way too unwieldy for writing simple wrappers. Hygienic macros written directly in the language are the one feature I really, really envy from Rust. The second is the token-stream-based macro interface, instead of an AST-based interface, which makes everything much more stable.

5 Likes

I like the AST-based interface, as it makes syntax extensions work better with other tools: LSP, ocamlformat.

A few years ago, while working heavily with Rust (for IBM Quantum), I did some analysis to figure out what it would take to implement Rust-style macros for OCaml. I maintain and use Camlp5, so it wouldn’t be very hard to hack together the parsing and AST support. But I realized pretty quickly that for anything other than trivial uses, such a facility would be useless, because too often macros depend on type-directed compilation (that is to say, “typeclasses”/“modular implicits”). E.g., the “debug-print” macro.
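To illustrate the point with a sketch (my example, not from the thread): a debug-print macro in Rust expands uniformly, but it only works because the `{:?}` formatting is resolved through the `Debug` trait, i.e. type-directed compilation supplies a printer per concrete type.

```rust
// The macro itself is type-agnostic; the trait system ("typeclasses")
// decides how to actually print each value at the expansion site.
macro_rules! debug_print {
    ($e:expr) => {{
        let v = $e;
        println!("{} = {:?}", stringify!($e), v); // requires `v: Debug`
        v
    }};
}

fn main() {
    let x = debug_print!(1 + 2); // prints: 1 + 2 = 3
    let xs = debug_print!(vec![x, x * 2]); // prints the expression text and [3, 6]
    assert_eq!(xs, vec![3, 6]);
}
```

Without something playing the role of `Debug` (typeclasses or modular implicits), the macro cannot know how to render an arbitrary value, which is exactly the limitation described above.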

1 Like

As I mentioned in my comment up-thread, you can get this in OCaml with a sufficiently robust meta-quotation facility. What there is today is not “sufficiently robust”, because macro-writers do not use it almost-exclusively; they still resort to direct AST manipulation.

I think it would be a useful goal for somebody to make metaquotation sufficiently powerful that most uses of direct AST manipulation in PPX rewriters disappeared.

I will note that one of Daniel de Rauglaudre’s original insights driving Camlp5 (originally Chamau) was that the homoiconicity that made LISP/Scheme macros so effective had its counterpart in ML not in direct AST manipulation, but in powerful meta/anti-quotation facilities. One can throw away a lot of Camlp5 and replace it with more-standard stuff, but this part remains far more powerful than anything else on offer in the OCaml ecosystem.

Not quite sure what it helps with in these cases

Oh interesting, I never thought about this. But I don’t think that’s fully true. In the past, I wrote extensive macros in Rust to define small DSLs that used Rust tokens but not Rust syntax, which wouldn’t be possible in OCaml, since I only get a well-formed AST or nothing.
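As a sketch of the kind of thing meant here (my own illustration, not the poster's actual macros), `macro_rules!` can match input that is not a valid Rust expression at all, because it sees token trees rather than a parsed AST:

```rust
// A tiny transition-table DSL: `idle => running;` is not a Rust expression,
// but it is a perfectly good sequence of tokens for the matcher below.
macro_rules! arrows {
    ( $( $from:ident => $to:ident ; )* ) => {
        vec![ $( (stringify!($from), stringify!($to)) ),* ]
    };
}

fn main() {
    let edges = arrows! {
        idle => running;
        running => done;
    };
    assert_eq!(edges, vec![("idle", "running"), ("running", "done")]);
}
```

An AST-only macro interface would reject `idle => running;` before the extension ever saw it, which is precisely the limitation being described.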

I also agree that maybe I should just put effort into improving metaquotation and reducing boilerplate for ppxlib. One has to admire the simplicity of

use proc_macro::TokenStream;

#[proc_macro]
pub fn macro_name(input: TokenStream) -> TokenStream {
    input
}

And absolutely nothing else

In case you have those examples lying around, I would very much like to see them. As I wrote, I thought about implementing Rust-style macros as part of Camlp5, but couldn’t think of enough useful examples. So having some would be most excellent.

They can be in Rust, and even really sketchy; I just need to see the idea.

–chet–
no one leaves home unless
home is the mouth of a shark

Oh interesting, I never thought about this. But I don’t think that’s fully true. In the past, I wrote extensive macros in Rust to define small DSLs that used Rust tokens but not Rust syntax, which wouldn’t be possible in OCaml, since I only get a well-formed AST or nothing.

I was very impressed with Rust’s macro system – their token trees and such. But seriously, if you would be willing to provide the -barest- sketches of what you did in Rust, I would be happy to try to do the same in Camlp5 as-is, and then, if that wasn’t commodious and pleasant, to implement something like Rust’s macros to make it commodious and pleasant.

Thing is, without motivating examples, it’s tough to get geared-up to do anything. And lately, I’m not very motivated. It’s the world situation [we mustn’t say anything about that in this forum, so I’ll leave it there.]

2 Likes

I think you might be correct: it makes no difference, because the LSP is only aware of the post-transformation representation. Does ocamlformat work with the pre-transformation or the post-transformation syntax tree? I actually don’t know. But if it works with the post-transformation tree, our syntax extensions had better be really good at tracking locations between source and output; it’s easy to get lazy and use a single location for a whole generated subtree. One point would ultimately remain: giving users some degree of syntactic familiarity, for example things like operator precedence.

pre-transformation or the post-transformation syntax tree

It runs on the pre-ppx AST. It has to, because the text being formatted is the source.

1 Like

So moving to token streams would disable ocamlformat for the code in extended syntaxes, a significant downside.

You could write something like ocp-indent that would work on the token stream, or maybe have some sort of modular architecture where the syntax extension itself is responsible for formatting.
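As a toy sketch of what "formatting on the token stream" could mean (entirely my own illustration, with tokens modeled as plain strings), an indenter only needs to track bracket depth, not a full AST:

```rust
// Reindent a flat token stream using only lexical structure (brace depth),
// the way an ocp-indent-style tool can work without parsing.
fn indent(tokens: &[&str]) -> String {
    let mut out = String::new();
    let mut depth: usize = 0;
    for &t in tokens {
        match t {
            "{" => {
                out.push_str("{\n");
                depth += 1;
                out.push_str(&"  ".repeat(depth));
            }
            "}" => {
                depth = depth.saturating_sub(1);
                out.push('\n');
                out.push_str(&"  ".repeat(depth));
                out.push('}');
            }
            _ => {
                out.push_str(t);
                out.push(' ');
            }
        }
    }
    out
}

fn main() {
    assert_eq!(indent(&["f", "{", "x", "}"]), "f {\n  x \n}");
    println!("{}", indent(&["let", "y", "=", "{", "g", "()", "}"]));
}
```

A real tool would of course need the language's actual lexer, but the point stands: layout can be recovered from tokens alone, without the extension's semantics.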

I don’t want to sound like a broken record here, but this is something Lean somewhat has.

When declaring syntax extensions to the parser, the library provides a number of no-op combinators that are only relevant for pretty printing – things like ppSpace, ppLine, etc.

I guess it’s not an exact one-to-one comparison, as Lean grammars can be indentation-sensitive, so formatting is often part of the parsing process, but it’s another cool feature of macros in Lean that I’ve really found a breath of fresh air.

2 Likes

Regarding performance, I was curious to compare Lean and OCaml on one small micro-benchmark that is somewhat relevant to my use case.

I use Iter.t very extensively as my main “nondeterminism” abstraction. I can’t do this in Lean, but I can use an equivalent of Seq.t (which I had to implement myself, basically copying the OCaml implementation).

I then wrote the following test in Lean:


def l : LazyList Nat := do
  let x <- LazyList.init (fun i => i) 100000
  let y <- LazyList.cons x (LazyList.cons (x + 1) (LazyList.singleton (x + 2)))
  pure (y * 2)


def main : IO Unit :=
  let sum := LazyList.foldl (fun a b => a + b) 0 l
  IO.println sum

and OCaml:

let l =
  let ( let* ) x f = Seq.flat_map f x in
  let* x = Seq.init 100000 (fun x -> x) in
  let* y = Seq.cons x (Seq.cons (x + 1) (Seq.return (x + 2))) in
  Seq.return (y * 2)

let () =
  let sum = Seq.fold_left ( + ) 0 l in
  Printf.printf "%d\n" sum

And :drum:
The Lean 4 binary is ~10 times slower than the OCaml one.
Is there a way to make an optimised build in Lean 4 that I am missing? Mind you, I haven’t run any optimiser on the OCaml binary: no flambda, no -O3.

In its current state, I would definitely not call Lean 4 “surprisingly fast”; maybe “surprisingly fast for a theorem prover”.

1 Like

You are comparing arbitrary-precision integers on the Lean side (Nat) with machine integers on the OCaml side (int), so this is a completely unfair benchmark. You should either replace Nat with UInt64 for Lean, or open Z (from Zarith) at the start for OCaml, so that roughly the same arithmetic operations are performed on both sides. (That said, I still expect OCaml to win by a large margin, but not by a factor of 10.)

Good point. That being said:

  • Changed OCaml’s ints to Z: no observable performance change, so OCaml is still 10x faster.
  • Changed Lean’s Nat to UInt64: still no observable performance change.

So no, integers are not the issue here.

1 Like

Your Lean implementation of LazyList.init is not quite right. Notice how the go function is fully applied. As a consequence, instead of creating a single closure, you are actually creating a list of 100’000 closures, which then has to be visited.

Also, if the performance of your code does not change when you switch between machine integers and large integers, then it is doubtful that you are actually measuring the ability of the compiler to properly optimize (or not) the code. Rather, you are measuring the overhead of the runtime. So, you might want to consider a different benchmark. (But if your point was to measure on a toy example the difference between both models of memory allocation and garbage collection, then it might be fine, due to the sheer amount of temporary objects.)

1 Like

I mean, yeah, this is only a tiny micro-benchmark, but Lean code makes very extensive use of monads, and if it’s not great at creating closures, I’m pretty worried about performance in general.

I reimplemented init as follows (which I believe doesn’t suffer from the flaw you pointed out) and it did not change the speed at all.

def init (f : Nat -> α) (n : Nat) : LazyList α :=
  let rec go (i : Nat) (_ : Unit) : LazyListNode α :=
    if i < n then LazyListNode.cons (f i) (LazyList.mk (go (i + 1))) else LazyListNode.nil
  LazyList.mk (go 0)

I do agree that this definitely needs better benchmarks. I was curious to reimplement my latest project from OCaml in Lean, but I got ultra-bored from the headache of threading an IO monad everywhere just to be able to communicate with an external process or write logs :upside_down_face:

All caught up on this thread.

Didn’t understand a bunch of stuff here yet (web developer).

But it seemed difficult, or maybe it’s my lack of knowledge at the time.

Cheers all.