Sandmark Nightly is a service for OCaml compiler developers that benchmarks development branches of the compiler on the large set of Sandmark benchmarks on tuned machines and reports the results in an interactive UI. Sandmark Nightly already reports many metrics, including running time. But running time is a notoriously noisy metric on modern architectures due to the effects of modern OS, architectural and micro-architectural optimisations. There can be swings of 50% in either direction when the directory in which the program is run changes (STABILIZER: Statistically Sound Performance Evaluation).
While we run Sandmark benchmarks on tuned machines, we still see the impact of noise on the real time measurements. To this end, we introduce a new metric into Sandmark Nightly that, in addition to real time, helps interpret the results while accounting for the noise. We now report “instructions retired” for Sandmark runs. Instructions retired reports the number of instructions executed by the program during its run and is hence shielded from the noise that affects real time measurements. Of course, the same number of instructions may be retired at different rates by the processor due to instruction-level parallelism, and hence the instructions retired metric should be used in conjunction with real time measurements. For example, if a new compiler feature adds 2 instructions to the prologue of a function, then the instructions retired metric tells you how many new instructions are actually executed on top of the baseline.
The instructions retired metric is collected using the perf command. We also have other useful metrics from perf, such as page faults, branches, branch misses, and cache misses at various levels of the hierarchy. We will add graphs to report these going forward. Enjoy the new feature, and as ever, report issues and bugs to Sandmark Issues.
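To make the perf-based collection a little more concrete, here is a minimal Python sketch of how counters such as instructions and cycles could be read from machine-readable perf output. It assumes the comma-separated format produced by `perf stat -x,`, where the first three fields are the counter value, its unit, and the event name. The sample text is made up for illustration, the helper name `parse_perf_stat_csv` is hypothetical, and this is not necessarily how Sandmark itself invokes or parses perf.

```python
import csv
import io


def parse_perf_stat_csv(text):
    """Map event name -> integer counter value.

    Assumes `perf stat -x,` style rows: value,unit,event,...
    """
    counters = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines, comment lines, and rows too short to name an event.
        if len(row) < 3 or row[0].startswith("#"):
            continue
        value, event = row[0], row[2]
        # Rows such as "<not supported>" or "<not counted>", and non-integer
        # values (for example a clock in milliseconds), are skipped here.
        if not value.isdigit():
            continue
        counters[event] = int(value)
    return counters


# Made-up sample output, for illustration only.
sample = """\
1200000,,instructions,1000000,100.00,1.50,insn per cycle
800000,,cycles,1000000,100.00,,
<not supported>,,cache-misses,0,100.00,,
"""

counters = parse_perf_stat_csv(sample)
print(counters)
print(counters["instructions"] / counters["cycles"])  # instructions per cycle
```

Note that perf writes its statistics to standard error by default, so a wrapper script would need to capture that stream (or use perf's `-o` option to write to a file) before parsing.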
The web service is available at https://sandmark.tarides.com and you can select the Perfstat Output radio button on the left panel as shown below.
After selecting the variants and a baseline for comparison, you can view the normalised instructions per cycle change as illustrated below:
You can also request that your development branches be added to the nightly runs of the Sandmark Nightly service via the sandmark-nightly-config repository.
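The comparison of variants against a baseline described above boils down to a per-benchmark ratio. The following Python sketch shows one plausible way to compute it, assuming normalisation means dividing each variant value by the baseline value for the same benchmark. The benchmark names, the numbers, and the helper `normalise` are hypothetical, and the Sandmark Nightly UI may compute its normalised view differently.

```python
def normalise(baseline, variant):
    """Return variant / baseline for every benchmark present in both runs.

    A ratio above 1.0 means the variant's metric is higher than the baseline's.
    """
    ratios = {}
    for name, base in baseline.items():
        # Skip benchmarks missing from the variant and zero baselines,
        # which would make the ratio undefined.
        if name in variant and base > 0:
            ratios[name] = variant[name] / base
    return ratios


# Hypothetical instructions-retired counts, for illustration only.
baseline = {"bench_a": 1_000_000, "bench_b": 2_000_000, "bench_c": 500_000}
variant = {"bench_a": 1_050_000, "bench_b": 1_900_000}

ratios = normalise(baseline, variant)
for name, ratio in sorted(ratios.items()):
    # Express the ratio as a percentage change relative to the baseline.
    print(f"{name}: {ratio:.2f}x ({(ratio - 1) * 100:+.1f}%)")
```

Because instructions retired is far less noisy than real time, a small ratio change in this metric is more likely to reflect an actual change in the generated code, though it should still be read alongside the running time.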
References
- Emery Berger and Charlie Curtsinger. STABILIZER: Statistically Sound Performance Evaluation.
- Run perfstat with Sandmark nightly service. Sandmark PR #394.
- Add webpage with perfstat output from Sandmark. Sandmark-nightly PR #81.
- perfstat.ipynb. Sandmark perfstat Jupyter notebook.