Performance of string comparison

Here’s the string comparison function in runtime/str.c.

CAMLprim value caml_string_compare(value s1, value s2)
{
  mlsize_t len1, len2;
  int res;

  if (s1 == s2) return Val_int(0);
  len1 = caml_string_length(s1);
  len2 = caml_string_length(s2);
  res = memcmp(String_val(s1), String_val(s2), len1 <= len2 ? len1 : len2);
  if (res < 0) return Val_int(-1);
  if (res > 0) return Val_int(1);
  if (len1 < len2) return Val_int(-1);
  if (len1 > len2) return Val_int(1);
  return Val_int(0);
}

Why is the comparison of the two lengths done after the memcmp call instead of beforehand? How does the performance of the two steps compare?

1 Like

Maybe because

String.compare "b" "aa"

should return 1.

Edit: My mistake, you mean alphanumeric sort ordering! So an equality check could compare lengths first, but sorting obviously requires peeking.

@bltxd Yeah I must have had equality on my mind when I read that function.

@VPhantom you only need to compare lengths if the strings are equal in content up to the smaller one’s length, since the comparison is lexicographical.

1 Like