Looking at OCaml in the benchmarks game over the past few months

So, I follow the benchmarks game pretty pedantically (for Julia-related reasons; Julia has made some very impressive gains in the past year). I realize general benchmarks like this are junk, but OCaml’s performance seems to have taken a hit in the past few months, and I’m wondering why that might be.

Here’s the results from December: https://web.archive.org/web/20200126145108/https://benchmarksgame-team.pages.debian.net/benchmarksgame/which-programs-are-fastest.html
A decent showing for OCaml.

Here’s the results from March: https://benchmarksgame-team.pages.debian.net/benchmarksgame/which-programs-are-fastest.html

It seems to have fallen behind Haskell, SBCL (Lisp) and (gulp!) Node.js in the most recent benchmarks.

I will grant that many of these benchmark programs have freaky optimizations and are implemented by people with intimate knowledge of the runtimes of the respective language, but I was still a little surprised to see JavaScript beating OCaml on any benchmark, and I began to wonder if we aren’t seeing some performance penalties for the multicore support that is being merged into mainline OCaml.

On the other hand, the benchmarks I run personally are fine, so maybe I’m reading too much into this? Maybe the Haskell and JS engineers are writing benchmark programs while the OCaml engineers have other interests?

I dunno. I’m sure it’s fine. Just asking.


I can’t answer most of this question. However, concerning multicore, AFAIK, the multicore changes merged so far shouldn’t have an impact on performance, as they are mainly some refactoring and reorganization of top-level runtime data structures in order to support more substantial changes later.


Thanks for the info!

If that’s what you think: don’t look and don’t ask others to look :wink:

Sep 12, 2019, all OCaml version 4.08.0 programs & measurements

Current, all OCaml version 4.10.0 programs & measurements

As-of March 29 2020, the measurements seem much the same.


I can’t stop looking! I check it religiously every month! It’s pathological. I started looking because of Julia, because it’s a performance oriented language, so it’s kind of important for public perception that it looks good in a fairly well-known benchmarks site. (It wasn’t doing so hot at the beginning of 2019, but it’s looking great now)

Thanks for the reassurance. I guess the GHC, SBCL and Node benchmarks must have been through some rewrites or something. The C++, C and Rust benchmarks have also been shuffling places regularly in the past few months, so the fastest benchmarks are probably getting faster as well.

I’ve also been kind of paranoid about the effect multicore OCaml was going to have in sequential code—being someone who prefers process-based parallelism anyway.

Good to know this is just my imagination running away with me.

Multicore may eventually affect performance of benchmarks and/or “real-world” sequential code, it just hasn’t done so yet.

I know! That’s why I’m so paranoid.

There’s a simple reason: someone has done the work to write faster programs —

I tend to work on it in ~20 minute spurts as a way of relaxing. For this one, I’d say I probably spent 2-3 hours.

I will say this, the overall amount of time I’ve spent on writing the Julia benchmarks is far higher than it would have been for someone already familiar with writing high performance code. It’s been a great learning experience for me.

You could try to write better OCaml programs and contribute them to the benchmarks game.

If you know it’s going to happen, no use worrying about it, right?


Do you know if ocaml multicore will benefit some of these benchmarks or are all these purely sequential?

Should help. If you look at individual benchmarks on the site, they give information about CPU usage, and most of the programs can be parallelized. Some of the OCaml benchmarks are already running on multiple cores, I assume using multiprocessing, but many of them are running on a single core where the F# versions are parallel:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/ocaml.html

There are a couple of benchmarks that can’t be effectively parallelized (on purpose). I think n-body and pidigits are the ones.

That is one important reason, but there has also been a lot of work in recent releases to get the JIT warmup time down—and it’s definitely had an impact as well. More efficient programs are definitely the largest factor, but the Julia platform isn’t standing still in terms of performance either.

As far as me contributing benchmarks, I could contribute, but I doubt mine would be any faster! I can follow best practices when it comes to avoiding unnecessary work as well as the next person, but I’m no performance engineer. I’m just an enthusiastic spectator of this silly website.

How do you know ?

How do you know timings for those benchmarks programs changed with new versions of Julia lang ?

If the timings did change, how do you know that was due to “JIT warmup time” and not some other Julia lang performance improvement ?

Guys, given that this is now getting off-topic I would recommend taking this specific question into some other forum. Cheers

Not that I typically cared about these kinds of benchmarks, but it seemed strange to me that OCaml could possibly perform less efficiently than some of these languages. I’ve just picked one benchmark that caught my eye (pidigits), where OCaml needed 11.04 seconds against Haskell’s 4.22 seconds. That seemed preposterous given that both languages should be using GMP and spend most of their time in that library.

Running this benchmark on my computer for both Haskell and OCaml, I get considerably better numbers: OCaml needed about 2.45s vs Haskell’s 3.1s, which is what I would have expected. This suggests a serious problem with the benchmark. One more reason to ignore it :slight_smile:
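For anyone who wants to repeat this kind of local check, here is a minimal sketch of a timing loop that runs a workload several times instead of once. The names `time_runs`, `summarize` and `workload` are made up for illustration, and the workload is a stand-in rather than the pidigits program from the site. Note that `Sys.time` reports processor time, not elapsed wall-clock time, so this is only a rough guide.

```ocaml
(* Hypothetical helper: run [f] [runs] times and return the CPU time
   of each run in seconds. Sys.time measures processor time. *)
let time_runs ~runs f =
  List.init runs (fun _ ->
    let start = Sys.time () in
    (* opaque_identity keeps the compiler from discarding the call *)
    ignore (Sys.opaque_identity (f ()));
    Sys.time () -. start)

(* Fastest and median of a non-empty list of timings. *)
let summarize times =
  let sorted = List.sort compare times in
  let n = List.length sorted in
  (List.hd sorted, List.nth sorted (n / 2))

(* Stand-in workload (an assumption, not a benchmarks game program). *)
let workload () =
  let acc = ref 0 in
  for i = 1 to 5_000_000 do
    acc := (!acc + i * i) land 0xFFFF
  done;
  !acc

let () =
  let times = time_runs ~runs:5 workload in
  let fastest, median = summarize times in
  Printf.printf "fastest %.4fs, median %.4fs over %d runs\n"
    fastest median (List.length times)
```

Reporting both the fastest and the median run gives some idea of how noisy a single measurement is on one machine. It says nothing about how the same program would behave on different hardware.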


I’m not sure about that. Getting different results on a different machine simply tells you that things are complex. One or the other specific result can’t tell you much. The benchmarks would ideally have to be run on a range of hardware.

However, I will agree that, in my experience of writing “normal” (or perhaps worse than normal!) programs in OCaml, it is much faster than many of the languages beating it in this benchmark. Heavily optimized code in any language isn’t necessarily representative of the performance of idiomatic code. Idiomatic OCaml is very fast.


My experience with OCaml is that it is much quicker than it would appear on the shootout site. The Julia community has definitely gone to town on these benchmarks since last I checked! Take a look at the Julia version of spectral-norm. It overloads + and / with injected LLVM IR, which is cool… however I doubt the average Julia user (scientist/engineer) would be coding this way regularly. It also uses macros to make it parallel and drop bounds checking. That is arguably more accessible to a normal Julia programmer, but you first need to know what bounds checking is and why it has an overhead to even consider using that particular macro ;-).

For the pidigits example almost all (or all) the programs are using GMP and if so this benchmark is essentially testing the overhead of the FFI. This is a strength of Julia as there is no overhead (effectively the same as calling a C function from C which is basically none).


I’ll grant that injecting IR is pretty fancy stuff (though I have seen it in some of Julia’s big libraries). Using macros to parallelize code and drop bounds checking is pretty standard stuff, from what I’ve seen. It’s mentioned in the performance tips section of the manual.

But yeah, no denying that a ton of work has gone into getting these benchmarks where they currently are.

it’s an antipattern.

These aren’t the droids you’re looking for. This is about rolling your own macros to try and get code inlined, like one might in C. The example specifically suggests using @inline instead of rolling your own macro.

Also, the bundled performance macros aren’t, uh, “normal” macros. They are more like compiler directives that use macro syntax. It wouldn’t be possible to implement them in regular user code.

And these “macros” are used constantly in the implementation of Julia’s standard library.