Tagging integers on Intel64 processors

For x86-64, ocamlopt emits code like leaq 1(%rax,%rax),%rax for tagging integers.

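However, on all Intel Core processors since 2011, leaq 1(,%rax,2),%rax would be better. It computes the same value, 2n + 1, from two address components (scaled index and displacement) instead of three (base, index and displacement).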

Comparison:

  • PRO: lower latency (1 cycle vs. 3 cycles for the current form)
  • PRO: higher throughput (2 per cycle vs. 1 per cycle)
  • CON: larger encoding (8 bytes vs. 5 bytes)
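
For reference, here is a minimal Python sketch of the arithmetic behind the two forms and of where the size difference comes from. The function names are illustrative and not taken from the compiler, 64-bit wraparound is ignored, and the byte sequences are hand-assembled, so they are worth confirming with an assembler.

```python
def tag_base_index(n):
    # Models leaq 1(%rax,%rax),%rax: base + index + displacement.
    return n + n + 1


def tag_scaled_index(n):
    # Models leaq 1(,%rax,2),%rax: index * 2 + displacement, no base register.
    return n * 2 + 1


def untag(v):
    # An arithmetic shift right by one recovers the original integer.
    return v >> 1


# Hand-assembled encodings (assumption: REX.W + 8D /r with a SIB byte).
# With a base register, the displacement fits in one byte (disp8).
ENC_BASE_INDEX = bytes([0x48, 0x8D, 0x44, 0x00, 0x01])
# Without a base register, the SIB form requires a 4-byte displacement (disp32).
ENC_SCALED_INDEX = bytes([0x48, 0x8D, 0x04, 0x45, 0x01, 0x00, 0x00, 0x00])

for n in (-3, 0, 1, 42):
    assert tag_base_index(n) == tag_scaled_index(n)
    assert untag(tag_scaled_index(n)) == n

print(len(ENC_BASE_INDEX), len(ENC_SCALED_INDEX))  # prints "5 8"
```

Both functions return the same tagged value for every input, so the choice between the two instructions is purely a trade-off between timing and code size.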

For AMD “Ryzen” processors, this change is not an optimization: both forms have the same latency and throughput.

Please consider adapting the code generator for Intel64 in ocamlopt. Thank you!

PS. For details on the timing, see https://www.agner.org/optimize/

For requests involving the compiler, please open an issue at https://github.com/ocaml/ocaml/issues

Regarding the request itself, as a general rule the compiler team has historically avoided optimizations that depend on specific processor models, as they add testing burden (more code paths to check), complexity (the code generator needs to check for specific models), etc.

Cheers,
Nicolás

Will do. And I will try to address the issues you mentioned. Very helpful!
Thank you for your reply.

I’d prefer it if participating in the development didn’t require a GitHub account. I’m sure the issues are known to most here.