Performance of `Printf.sprintf`

Using core_bench, I found a hot path in my code: formatting a date to a string with Calendar was rather slow (https://github.com/ocaml-community/calendar/issues/24). I went looking for faster implementations and found Ptime and Core's strftime.
Core is by far the fastest of the three, so I would like to use it. The problem is that I also need Windows support, which seems to rule Core out (https://github.com/janestreet/core/issues/84).

Since I only need one fixed, basic format, I was thinking of rolling my own implementation:

let format = ({tm_year, tm_mon, tm_mday, tm_hour, tm_min, tm_sec}: Unix.tm) => {
  Printf.sprintf(
    "%d-%02d-%02dT%02d:%02d:%02dZ",
    1900 + tm_year,
    tm_mon + 1,
    tm_mday,
    tm_hour,
    tm_min,
    tm_sec,
  );
};

But this is still 3 to 4 times slower than Core, and I'm not sure why. Maybe because Core delegates to the native C strftime?
Still, I had not expected such a large difference.

Is there anything I can do to close the gap?

Or is there a way with dune to use Core on Linux and macOS, but my own implementation on Windows?

Thanks


Yes, Core just calls the C strftime function (source). So, you could do something similar on Windows, presumably. I don’t know what the Windows C stubs would look like, unfortunately.

Are there possible optimisations to my code that would bring its performance closer to strftime? I tried building the string myself with concatenation, but that is about as slow as sprintf, or slower.
To be honest, I had expected the difference to be smaller. It's not too bad, but I'm formatting millions of dates, so it adds up.

Did you try to create a Bytes.t and write in-place the date? Your format looks like it is mostly fixed size, and you can probably afford a slow branch for years outside of the [1000,9999] interval.

I haven’t tried Bytes yet, so I will try that. I'm not sure what you mean by writing the date in place, though. I’m only interested in dates in the 2015-2025 range.

I meant something like:

let memory = Bytes.of_string "2015-12-31T23:58:59Z"
let set pos int = Bytes.unsafe_set memory pos (Char.chr (48 + int))
let print year month day h m s =
  set 2 (year/10 mod 10);
  set 3 (year mod 10);
  ...;
  Bytes.to_string memory
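
For reference, here is one way the elided lines might be filled in. This is only a sketch and not necessarily what the OP ended up with: the names `template`, `set2` and `format_utc` are made up for illustration, it copies a template per call instead of sharing one global buffer, and it assumes the fields are already in valid ranges. Years outside 0..9999 take a slow `Printf` fallback, along the lines of the "slow branch" suggested earlier.

```ocaml
(* Hypothetical sketch: fixed-width "YYYY-MM-DDTHH:MM:SSZ" formatting
   without Printf on the common path. Field ranges are assumed valid. *)

let template = "0000-00-00T00:00:00Z"

(* Write [n] as two ASCII digits at [pos]; assumes 0 <= n <= 99. *)
let set2 buf pos n =
  Bytes.unsafe_set buf pos (Char.unsafe_chr (48 + n / 10));
  Bytes.unsafe_set buf (pos + 1) (Char.unsafe_chr (48 + n mod 10))

let format_utc ~year ~month ~day ~hour ~minute ~second =
  if year < 0 || year > 9999 then
    (* Slow branch: the year does not fit in four digits. *)
    Printf.sprintf "%04d-%02d-%02dT%02d:%02d:%02dZ"
      year month day hour minute second
  else begin
    (* One allocation: a fresh copy of the template. *)
    let buf = Bytes.of_string template in
    set2 buf 0 (year / 100);
    set2 buf 2 (year mod 100);
    set2 buf 5 month;
    set2 buf 8 day;
    set2 buf 11 hour;
    set2 buf 14 minute;
    set2 buf 17 second;
    (* Safe here because [buf] is never touched again after this point. *)
    Bytes.unsafe_to_string buf
  end
```

With a `Unix.tm` value `tm`, the call would look like `format_utc ~year:(1900 + tm.tm_year) ~month:(tm.tm_mon + 1) ~day:tm.tm_mday ~hour:tm.tm_hour ~minute:tm.tm_min ~second:tm.tm_sec`. Copying the template per call avoids sharing mutable state between threads or domains, at the same cost of one allocation per formatted date.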

Wow thank you very much. The performance of this is even 4 times faster than the C strftime implementation. So this is exactly what I need.

It’s also very good for me to learn that even a bit of string concatenation can be slower than writing into Bytes. I didn’t think it would make this much of an impact here.


This is a nice demonstration of a general optimization principle: it is -amazing- how often “allocation avoidance” can speed up programs. Just amazing. And this holds across many different languages, but especially in GCed languages.

A useful way to profile programs is to use a “clock” which is “bytes allocated”, and then perform the same analyses you would do if the clock were “seconds elapsed”. Finding allocation hotspots, in short.
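
A rough way to try the "bytes allocated" clock in OCaml is to read the GC's counters around a piece of code. The sketch below uses `Gc.allocated_bytes` from the standard library; the helper `bytes_per_call` and the function `with_sprintf` are names made up for illustration, and the numbers will vary with compiler version and flags.

```ocaml
(* Hypothetical helper: run [f] [n] times and return the average number
   of bytes allocated per call, according to the GC's counters. *)
let bytes_per_call ?(n = 1_000) f =
  let before = Gc.allocated_bytes () in
  for _ = 1 to n do
    (* opaque_identity keeps the compiler from discarding the call. *)
    ignore (Sys.opaque_identity (f ()))
  done;
  let after = Gc.allocated_bytes () in
  (after -. before) /. float_of_int n

(* The Printf-based formatting from the original post, on fixed inputs. *)
let with_sprintf () =
  Printf.sprintf "%d-%02d-%02dT%02d:%02d:%02dZ" 2015 12 31 23 58 59

let () =
  Printf.printf "sprintf: %.0f bytes/call\n" (bytes_per_call with_sprintf)
```

Running the same helper over two candidate implementations gives a quick allocation comparison before reaching for a full profiler.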


Not exactly the OP's question, but the following post may be of interest as well:

https://www.lexifi.com/blog/ocaml/note-about-performance-printf-and-format/


Obligatory mention of pmpa : https://sympa.inria.fr/sympa/arc/caml-list/2011-08/msg00050.html


w00t! Thank you, ygrek! I don’t need it today, but … I’m sure I will sooner or later!