We at Imandra just open sourced minisat-ml, a reimplementation of Minisat 2.2 in OCaml. The goal is to measure the performance difference between C++ and low-level OCaml code that stays as close as possible to the original.
The report provides more detailed explanations of the porting process, as well as some experimental results. Personally I think it’d also be an interesting sample point for flambda and other performance tools in the ecosystem, since minisat 2.2 is a fixed target in a field known to require very high performance.
This is a very interesting benchmark with an encouraging result IMO. Is it correct to say that minisat-ml takes roughly twice the time of minisat? I found the results a bit hard to interpret because the averages are computed over different sets of solved problems, a consequence of the fixed timeout. I guess they would be easier to compare if the set of problems were held constant.
Another question: did this exercise yield a short cookbook of basic rules for performance-sensitive modern OCaml that uses flambda? I’d love to have something like that… Specifically, are most of the let[@inline] annotations in the source necessary, or were they added ‘just in case’?
It’s actually slower than 2x minisat (my impression from the easier benchmarks is that it’s roughly 3–5x slower when the behavior is the same). I’m not sure how to render that properly in graphs; suggestions are welcome (some CSV files are in the repo, too).
For the performance-sensitive code, I just used my own experience: I tried to avoid allocations, used [@inline] quite a bit “just in case” (to ensure small helpers disappear), and passed some ocamlopt options (-O3 -unbox-closures -unbox-closures-factor 20). There are probably things I have overlooked. Some for loops were added after profiling, because tail-recursive inner functions would appear as bottlenecks.
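To make those two habits concrete, here is a minimal sketch in my own words (the names `var_of_lit` and `count_assigned` are hypothetical, not taken from the minisat-ml source): an `[@inline]` annotation on a tiny helper so it disappears after inlining, and a plain for loop over an int array instead of a tail-recursive inner function, which avoids both a closure and any allocation in the hot path.

```ocaml
(* Hypothetical helpers in the style described above; not from the repo. *)

(* A tiny accessor annotated so flambda inlines it away entirely. *)
let[@inline] var_of_lit (lit : int) : int = lit lsr 1

(* Counting assigned literals with a plain for loop over int arrays:
   no closure and no allocation, unlike a tail-recursive inner function
   that the compiler may fail to compile down to a simple loop. *)
let count_assigned (assign : int array) (clause : int array) : int =
  let n = ref 0 in
  for i = 0 to Array.length clause - 1 do
    let v = var_of_lit clause.(i) in
    if assign.(v) <> 0 then incr n
  done;
  !n

let () =
  let assign = [| 1; 0; 1; 0 |] in  (* vars 0 and 2 are assigned *)
  let clause = [| 0; 2; 4; 6 |] in  (* positive literals for vars 0..3 *)
  assert (count_assigned assign clause = 2)
```

The `ref` here is fine performance-wise: flambda (and even closure mode) keeps such a locally-scoped reference in a register rather than allocating it.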
I use perf for profiling, combined with flamegraph to actually see what’s going on (see src/scripts/profile.sh). For example, it shows that the time spent in boolean propagation is in the range of SAT-solving folklore (~80% of the time is spent there). perf does a good job, but it is sometimes hard to assess the impact of caml_modify and the GC with it.
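For readers wondering why caml_modify shows up in such profiles at all, here is a small sketch of my own (not from the repo) contrasting the two kinds of array stores: writing an immediate int is a plain memory write, while writing a boxed value goes through the caml_modify write barrier, so keeping hot state in int arrays keeps that cost out of the inner loops.

```ocaml
(* Illustrative only: contrasting stores that do and do not hit the
   caml_modify write barrier. *)

type boxed = { value : int }  (* a heap-allocated record *)

(* Immediate ints: a.(i) <- i compiles to a plain store. *)
let fill_ints (a : int array) =
  for i = 0 to Array.length a - 1 do
    a.(i) <- i
  done

(* Boxed values: each store is a pointer write, which goes through the
   caml_modify write barrier (and each { value = i } also allocates). *)
let fill_boxed (a : boxed array) =
  for i = 0 to Array.length a - 1 do
    a.(i) <- { value = i }
  done

let () =
  let a = Array.make 4 0 in
  fill_ints a;
  assert (a.(3) = 3);
  let b = Array.make 4 { value = 0 } in
  fill_boxed b;
  assert (b.(2).value = 2)
```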