I went down a pretty deep rabbit hole here. This code definitely shouldn’t raise an exception, so I looked into the implementation of read_line.
So, I’m looking for segfaults. There is a call to bytes_unsafe_to_string at the end, but it looks fine because the function never touches the bytes after that call. Yet, if you modify the code to copy the string returned by read_line, the answer appears to be accepted, e.g.:
let count_words str =
  let str = String.trim str in
  if str = "" then 0 else (
    let words = String.split_on_char ' ' str in
    List.length words)

let () =
  try
    (let s = String.copy (read_line ()) in
     print_int (count_words s))
  with _ -> ()
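To rule out the word-counting logic itself, a quick sketch like the following exercises count_words on a few inputs without touching stdin. The `copy` helper is a name I made up for illustration; it goes through Bytes to force a fresh allocation, since String.copy is deprecated and was removed in OCaml 5. This only checks the pure logic and says nothing about what read_line does on the site.

```ocaml
(* Same logic as the snippet above, restated so this stands alone. *)
let count_words str =
  let str = String.trim str in
  if str = "" then 0
  else List.length (String.split_on_char ' ' str)

(* Hypothetical helper: forces a fresh allocation, like String.copy did. *)
let copy s = Bytes.to_string (Bytes.of_string s)

let () =
  assert (count_words "" = 0);
  assert (count_words "   " = 0);
  assert (count_words (copy "  hello world  ") = 2);
  (* Consecutive spaces produce an empty field, so this counts 3. *)
  assert (count_words "a  b" = 3)
```

None of these paths can raise, which is consistent with the failure being a crash in the runtime rather than an exception in the program.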
Curiously, if I add a line before the main program to force a GC cycle:
let () = Gc.full_major ()
Then the crash doesn’t occur. Even checking the GC allocation policy appears to perturb something enough that the crash doesn’t occur:
let () = assert ((Gc.get ()).allocation_policy = 0)
I don’t really understand why this is the case. FWIW, the version of OCaml used by the website is 4.07.0, and it’s using the bytecode compiler rather than native code. 4.07.0 has a known crash with the first-fit allocation policy, but that doesn’t appear to be in play here: the assertion above checks that the policy is 0, and it passes.
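If anyone wants to confirm the environment for themselves, something like this sketch prints the runtime details to stderr. The `backend_name` and `describe_runtime` names are mine, not anything the site provides; the calls themselves are the standard Sys and Gc modules. Bear in mind that, as noted above, merely calling Gc.get seemed to be enough to make the crash disappear, so this may well mask the problem it is meant to describe.

```ocaml
(* Map the backend variant from Sys to a printable name. *)
let backend_name = function
  | Sys.Native -> "native"
  | Sys.Bytecode -> "bytecode"
  | Sys.Other name -> name

(* Summarise the compiler version, backend, and a couple of GC settings. *)
let describe_runtime () =
  let gc = Gc.get () in
  Printf.sprintf "ocaml=%s backend=%s policy=%d minor_heap=%d words"
    Sys.ocaml_version
    (backend_name Sys.backend_type)
    gc.Gc.allocation_policy
    gc.Gc.minor_heap_size

(* Write to stderr so the program's graded stdout is left untouched. *)
let () = prerr_endline (describe_runtime ())
```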
I would try fuzzing, but I don’t think the OCaml fuzzing support works in bytecode mode, so I don’t know how much it would tell us about this crash.
It might be worth reaching out to the site admins to get more details about how the code is run, and which test case crashes. This certainly seems like a very strange segfault.