After three alpha releases, we have created a first beta version to help you
adapt your software to the new features ahead of the release.
The compatibility of the opam ecosystem with OCaml 4.11.0 is currently quite
good, and it should be possible to test this beta without too much trouble.
Compared to the last alpha release, this first beta release contains the following new bug fixes:
Driver
#9011: Allow linking .cmxa files with no units on MSVC by not requiring the
.lib file to be present.
(David Allsopp, report by Dimitry Bely, review by Xavier Leroy)
Typechecker
#9384, #9385: Fix copy scope bugs in substitutions
(Leo White, review by Thomas Refis, report by Nick Roberts)
#9695, #9702: no error when opening an alias to a missing module
(Jacques Garrigue, report and review by Gabriel Scherer)
Warnings
#7897, #9537: Fix warning 38 for rebound extension constructors
(Leo White, review by Florian Angeletti)
#9244: Fix some missing usage warnings
(Leo White, review by Florian Angeletti)
Toplevel
#9415: Treat open struct as include struct in toplevel
(Leo White, review by Thomas Refis)
#9416: Avoid warning 58 in flambda ocamlnat
(Leo White, review by Florian Angeletti)
Flambda backend
#9163: Treat loops properly in un_anf
(Leo White, review by Mark Shinwell, Pierre Chambart and Vincent Laviron)
My colleagues did some extensive benchmarking before the first release of flambda (back in 4.03) to find reasonable settings for the default, -O2 and -O3 flags (note that there are no -O1 or -O0 flags).
The current choice for default is still considered the best one: it means compilation times are comparable to the non-flambda (Closure) version, while generated code is roughly similar in performance.
In my opinion, if you want to make -O3 the default, this shouldn’t be a patch to the compiler or the opam repository, but rather to the build systems: dune could pass the -O3 flag to ocamlopt when in release mode, for instance.
A lot has changed since 4.03, though. And the BAP project's experience confirms that there is a huge performance boost for non-trivial code (see "[ANN] BAP 2.1.0 Release", post #9 by ivg):
Speaking of the numbers, I am seeing improvements between 10% and 30% that are mostly coming from significantly reduced pressure on the garbage collector.
OK, so after recompiling everything with -O3, I’ve got about 40% improvement in such simple task as binary disassembling, which is fantastic, nearly twice as fast. On more complex benchmarks the results are less impressive, something between 10% to 20%.
Forcibly enabling the -O3 option in -flambda switches without further testing seems far too early to me.
It seems to me that this option could first be tried in an experimental opam repository for one release cycle before being proposed as the default option for flambda switches.
This ounit issue blocks openSUSE due to this build dependency chain: ounit -> re -> ppxlib -> rest of ocaml pkgs. Since make check should run, an outdated ounit is fatal.
Thanks a lot for the report. This error is due to a change in the backtrace format introduced in #9096. I will try to propose a patch to update oUnit's ad hoc backtrace parser.
Also note that if there are any other blockers, I encourage you to report them on the opam-repository readiness issue as well. It would help to document the constraints and blockers of downstream distributions.
@vlaviron @octachron, I would be very interested to read what you might think of the proposal to use an opam config variable so that individual users could more easily enable higher optimization levels for an entire switch. I have not had a chance to test this approach, but presuming that it plays nicely with opam, would that already be a step too far in your view? It seems to me to be premature to change the default, but I also think that it would be beneficial to make it easier for people to test with a whole switch that is as highly optimized as possible.
It also seems possible that the optimal point in the trade-off between compilation time and executable speed is different when building dependencies in a switch than when compiling a project directly. That is, at least hopefully, we compile each library in each switch rather less often than we compile the projects we are developing. So it may make sense for the default when opam is building dependencies to opt for a higher optimization level than the default that is set for builds in the edit-typecheck-compile-debug cycle.
If it works nicely with opam, it sounds like a good way to provide more control over the optimization options. But the stateful nature of this option seems a bit worrying: my memory from Gentoo is that tracking bugs that only appear at specific optimisation levels is not fun. Even more so when the optimisation level is implicit and may vary between packages.
I don’t mind at all if it gets more people to try different flambda options, but my advice is to always consider optimisation flags on a package or project basis, not blindly on a whole opam switch.
Although I’ll admit that it could be a nice idea to have a specified opam config variable for the required optimisation level (I wouldn’t use numbers though, but names like “default”, “aggressive” and “very aggressive” for instance), in my opinion it would be the job of each package to decide how to translate those hints into actual compilation flags. I would expect build systems to help by providing reasonable defaults though, and the -O2 and -O3 flags were designed explicitly in the hope that they would be good defaults.
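To make the idea of named hints concrete, here is a rough sketch of how a package's build script might translate such a hint into flags. Everything here is hypothetical: no such opam config variable exists today, the type and function names are made up for illustration, and the mapping to -O2 and -O3 is just one policy a package could pick.

```ocaml
(* Hypothetical optimisation hints, as a package might read them from an
   opam config variable. Neither the variable nor these names exist in opam. *)
type hint = Default | Aggressive | Very_aggressive

(* Parse the textual hint; unknown values are rejected rather than guessed. *)
let hint_of_string = function
  | "default" -> Some Default
  | "aggressive" -> Some Aggressive
  | "very aggressive" -> Some Very_aggressive
  | _ -> None

(* One possible per-package policy: map each hint to ocamlopt flags.
   A package with different needs could choose other flags entirely. *)
let ocamlopt_flags = function
  | Default -> []
  | Aggressive -> [ "-O2" ]
  | Very_aggressive -> [ "-O3" ]

let () =
  (* Fall back to the default level when the hint is not recognised. *)
  let hint = Option.value (hint_of_string "aggressive") ~default:Default in
  print_endline (String.concat " " (ocamlopt_flags hint))
```

The point of keeping the mapping inside the package is that each project stays free to decide what "aggressive" means for its own code.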
But that’s kind of the problem: -O3 is very far from the highest settings that you can give, it was picked as a reasonable compromise for people willing to spend a non-trivial amount of extra time compiling for hopefully better runtime performance. By tweaking the flambda parameters you can set bigger inlining thresholds, or more inlining rounds, but whether it is useful or not will highly depend on the specifics of the code being compiled.
To me, this corresponds to dune profiles, and would deserve to be discussed with the dune maintainers. The -O2 and -O3 options are ignored by the non-flambda compiler but do not cause an error, so you may want to see if they can be convinced to add one of them by default in the release profile.
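In the meantime, an individual project can already opt in on its own. A minimal sketch, assuming a dune-based project that is built with `dune build --profile release`; the placement in the top-level `dune` file and the choice of -O3 over -O2 are illustrative, not a recommendation.

```dune
; Top-level dune file of the project (illustrative placement).
; Adds -O3 to the native-code compiler flags in the release profile only;
; the dev profile keeps dune's standard flags.
(env
 (release
  (ocamlopt_flags (:standard -O3))))
```

Since the non-flambda compiler accepts the flag and ignores it, the same file should work on switches with and without flambda.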