I recently received this comment (thanks!) for a package I published. It is honestly something I never think of before.
str_1 ^ str_2 ^ str_3 ^ str_4 ^ str_5
This code seems innocent and clean. But if you look at
^ underlying implementation, you will know this is inefficient.
^ only handles two strings at a time. So 4 new
Bytes.t objects are created during the process and the first string is copied 4 times. On contract,
String.concat "" can do everything in one run.
This comment led me back to re-evaluate all my string operations, and reimplement some of them. For example, I originally had
let align_left w c s = let len = String.length s in if len >= w then s else s ^ String.make (w - len) c
A re-implementation was
let align_left w c s = if len >= w then s else let b = Bytes.create w in Bytes.blit_string s 0 b 0 len; Bytes.fill b len (w - len) c; Bytes.to_string b
To my surprise, the re-implementation is slightly slower than the old one. So I rewrote the function using lower-level API.
external bytes_unsafe_fill : bytes -> int -> int -> char -> unit = "caml_fill_bytes" [@@noalloc] external bytes_unsafe_blit_string : string -> int -> bytes -> int -> int -> unit = "caml_blit_string" [@@noalloc] external bytes_unsafe_to_string : bytes -> string = "%bytes_to_string" let align_left w c s = if len >= w then s else let b = Bytes.create w in bytes_unsafe_blit_string s 0 b 0 len; bytes_unsafe_fill b len (w - len) c; bytes_unsafe_to_string b
This code is up to 100% faster the original one (according to my limited and unscientific tests)!
The questions are
It seems chaining
^is a trap a lot of people will fall into. Is it possible to have compiler level optimization for this? When the complier sees a chain of
^operation, it will be automatically compiled to
String.concator something else similar. Is there anything prevent us doing this?
Why does the second implementation have worse performance than the first one?
Is it a good idea to use all the external functions as in the third implementation?
And… what is the best practice for performant string operation in general?
Thanks. Happy hacking!