I recently received this comment (thanks!) for a package I published. It is honestly something I never think of before.
str_1 ^ str_2 ^ str_3 ^ str_4 ^ str_5
This code seems innocent and clean. But if you look at ^
underlying implementation, you will know this is inefficient. ^
only handles two strings at a time. So 4 new String.t
/Bytes.t
objects are created during the process and the first string is copied 4 times. On contract, String.concat ""
can do everything in one run.
This comment led me back to re-evaluate all my string operations, and reimplement some of them. For example, I originally had
let align_left w c s =
let len = String.length s in
if len >= w then
s
else
s ^ String.make (w - len) c
A re-implementation was
let align_left w c s =
if len >= w then
s
else
let b = Bytes.create w in
Bytes.blit_string s 0 b 0 len;
Bytes.fill b len (w - len) c;
Bytes.to_string b
To my surprise, the re-implementation is slightly slower than the old one. So I rewrote the function using lower-level API.
external bytes_unsafe_fill : bytes -> int -> int -> char -> unit
= "caml_fill_bytes" [@@noalloc]
external bytes_unsafe_blit_string : string -> int -> bytes -> int -> int -> unit
= "caml_blit_string" [@@noalloc]
external bytes_unsafe_to_string : bytes -> string = "%bytes_to_string"
let align_left w c s =
if len >= w then
s
else
let b = Bytes.create w in
bytes_unsafe_blit_string s 0 b 0 len;
bytes_unsafe_fill b len (w - len) c;
bytes_unsafe_to_string b
This code is up to 100% faster the original one (according to my limited and unscientific tests)!
The questions are
-
It seems chaining
^
is a trap a lot of people will fall into. Is it possible to have compiler level optimization for this? When the complier sees a chain of^
operation, it will be automatically compiled toString.concat
or something else similar. Is there anything prevent us doing this? -
Why does the second implementation have worse performance than the first one?
-
Is it a good idea to use all the external functions as in the third implementation?
-
And… what is the best practice for performant string operation in general?
Thanks. Happy hacking!