My colleagues did some extensive benchmarking before the first release of flambda (back in 4.03) to find reasonable settings for the default, -O2 and -O3 flags (note that there are no -O1 or -O0 flags).
The current choice for default is still considered the best one: it means compilation times are comparable to the non-flambda (Closure) version, while generated code is roughly similar in performance.
In my opinion, if you want to make -O3 the default, this shouldn’t be a patch to the compiler or the opam repository, but rather to the build systems: dune could pass the -O3 flag to ocamlopt when in release mode, for instance.
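As a rough sketch of what this can already look like on the project side, dune's `env` stanza lets a project add flags per profile. The fragment below is only an illustration, assuming it is placed in a `dune` file at the root of the project; it is not something dune does by default.

```dune
; Sketch only: pass -O3 to ocamlopt when building with --profile release.
; Non-flambda compilers accept the flag and ignore it.
(env
 (release
  (ocamlopt_flags (:standard -O3))))
```

With this in place, `dune build --profile release` would use -O3, while the default dev profile keeps the usual flags and compilation times.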
Speaking of the numbers, I am seeing improvements between 10% and 30% that are mostly coming from significantly reduced pressure on the garbage collector.
OK, so after recompiling everything with -O3, I’ve got about a 40% improvement in such a simple task as binary disassembly, which is fantastic, nearly twice as fast. On more complex benchmarks the results are less impressive, somewhere between 10% and 20%.
Thanks a lot for the report, this error is due to a change in the backtrace format introduced in #9096. I will try to propose a patch to update oUnit’s ad hoc backtrace parser.
Also note that if there are any other blockers, I encourage you to report them on the opam-repository readiness issue as well. It would help to document the constraints and blockers of downstream distributions.
@vlaviron @octachron, I would be very interested to read what you might think of the proposal to use an opam config variable so that individual users could more easily enable higher optimization levels for an entire switch. I have not had a chance to test this approach, but presuming that it plays nicely with opam, would that step already be too far in your view? It seems to me to be premature to change the default, but I also think that it would be beneficial to make it easier for people to test with a whole switch that is as highly optimized as possible.
It also seems possible that the optimal point in the trade-off between compilation time and executable speed is different when building dependencies in a switch versus compiling a project directly. That is, at least hopefully, we compile each library in each switch rather less often than we compile the projects we are developing. So it may make sense for the default when opam is building dependencies to opt for a higher optimization level than the default that is set for builds in the edit-typecheck-compile-debug cycle.
If it works nicely with opam, it sounds like a good way to provide more control over the optimization options. But the stateful nature of this option seems a bit worrying: my memory from Gentoo is that tracking bugs that only appear at specific optimisation levels is not fun. Even more so when the optimisation level is implicit and may vary between packages.
I don’t mind at all if it gets more people to try different flambda options, but my advice is to always consider optimisation flags on a package or project basis, not blindly on a whole opam switch.
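For instance, a project that has measured a benefit for one specific library could opt in for that library alone. The following is a minimal sketch, where the library name `hot_path` is purely hypothetical:

```dune
; Sketch only: enable -O3 for a single library rather than a whole switch.
; "hot_path" is a made-up name used for illustration.
(library
 (name hot_path)
 (ocamlopt_flags (:standard -O3)))
```

Other libraries in the same project keep the default settings, which makes it easier to tell which optimisation level a given bug or speed-up is tied to.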
Although I’ll admit that it could be a nice idea to have a specified opam config variable for the required optimisation level (I wouldn’t use numbers though, but names like “default”, “aggressive” and “very aggressive” for instance), in my opinion it would be the job of each package to decide how to translate those hints into actual compilation flags. I would expect build systems to help by providing reasonable defaults though, and the -O2 and -O3 flags were designed explicitly in the hope that they would be good defaults.
But that’s kind of the problem: -O3 is very far from the highest settings that you can give; it was picked as a reasonable compromise for people willing to spend a non-trivial amount of extra time compiling for hopefully better runtime performance. By tweaking the flambda parameters you can set bigger inlining thresholds, or more inlining rounds, but whether it is useful or not will depend heavily on the specifics of the code being compiled.
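To illustrate what going beyond -O3 could look like, here is a sketch using the `-inline` and `-rounds` options of an flambda-enabled ocamlopt. The executable name and the numeric values are arbitrary assumptions chosen for the example, not recommendations; they would need to be benchmarked on the actual code.

```dune
; Sketch only: the values below are illustrative, not tuned settings.
; -inline raises the inlining threshold, -rounds sets the number of
; flambda optimisation rounds. "main" is a hypothetical executable name.
(executable
 (name main)
 (ocamlopt_flags (:standard -O3 -inline 200 -rounds 4)))
```

Expect compilation time and code size to grow with such settings, with no guarantee of a runtime gain.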
To me, this corresponds to dune profiles, and would deserve to be discussed with the dune maintainers. The -O2 and -O3 options are ignored by the non-flambda compiler but do not cause an error, so you may want to see if they can be convinced to add one of them by default in the release profile.