In the Pervasives module, we have an output_string which takes care of writing a string on an output channel, even if the string is huge.
On the other hand, there is no read_string : in_channel -> string function that reads all the remaining text in an input channel (assuming it corresponds to a text file).
Is this deliberate? I would naively have thought that reading and writing are rather symmetrical operations.
It shouldn’t be hard to raise exceptions in those cases, should it? It’s not very difficult to write a small function that does that, but I’m surprised that it’s not built in.
I’m glad you asked! Someone motivated could definitely try to revive https://github.com/ocaml/ocaml/pull/640, which adds some helpers to the stdlib’s IO. Good luck!
It is not too hard, because it is impossible to do “correctly”. The read operation on a file might block forever, for example when the file lives on an NFS mount whose connection has gone away.
I must point out that “compiler libs” (the set of modules that are part of the compiler implementation) currently give no guarantee about the stability of their API. You should only use them if you really need them and you know what you are doing.
So for a simple function such as string_of_file, which can be implemented independently from the compiler codebase, you really should use another library (containers, bos, base) instead of compiler libs.
let string_of_file fn =
  let buff_size = 1024 in
  let buff = Buffer.create buff_size in
  let ic = open_in fn in
  let line_buff = Bytes.create buff_size in
  begin
    let was_read = ref (input ic line_buff 0 buff_size) in
    while !was_read <> 0 do
      Buffer.add_subbytes buff line_buff 0 !was_read;
      was_read := input ic line_buff 0 buff_size;
    done;
    close_in ic;
  end;
  Buffer.contents buff
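One weakness of the function above is that the channel stays open if input raises partway through. Below is a minimal sketch of a variant that addresses this, assuming OCaml 4.08 or later for Fun.protect. The name read_all is hypothetical, and opening the file in binary mode (to avoid newline translation on Windows) is a choice of this sketch, not something the original does.

```ocaml
(* Hypothetical helper: read everything remaining on an input channel. *)
let read_all (ic : in_channel) : string =
  let chunk_size = 4096 in
  let chunk = Bytes.create chunk_size in
  let buf = Buffer.create chunk_size in
  let rec loop () =
    (* [input] returns 0 only when the end of the channel is reached. *)
    let n = input ic chunk 0 chunk_size in
    if n > 0 then begin
      Buffer.add_subbytes buf chunk 0 n;
      loop ()
    end
  in
  loop ();
  Buffer.contents buf

(* Same idea as above, but the channel is closed even if reading raises. *)
let string_of_file (fn : string) : string =
  let ic = open_in_bin fn in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> read_all ic)
```

This still blocks for as long as the underlying read blocks, so it does not address the NFS concern raised earlier. On OCaml 4.14 or later, the standard library's In_channel.input_all covers the same need.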
Yep. I used to have one in my “utils” library. But recently I started using the “Bos” library for “convenient file I/O” and it’s got nice verbs for things like this – file contents, dir contents, etc.
I do think the symmetric operation to output_string is called input_line in the stdlib:
val input_line : in_channel -> string
Read characters from the given input channel, until a
newline character is encountered. Return the string of
all characters read, without the newline character at the end.
Raise [End_of_file] if the end of the file is reached
at the beginning of line.
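Since input_line returns a single line per call, reading a whole channel with it takes a loop. Here is a small sketch, assuming OCaml 4.02 or later for the match ... with exception syntax; the name read_lines is hypothetical. Note that the newline characters are dropped, so the result is not an exact inverse of what output_string wrote.

```ocaml
(* Hypothetical helper: collect every remaining line of a channel. *)
let read_lines (ic : in_channel) : string list =
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    (* [input_line] signals end of input by raising [End_of_file]. *)
    | exception End_of_file -> List.rev acc
  in
  loop []
```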
There’s something like that in each and every stdlib extension or alternative, including batteries and containers! Oftentimes it’s better to load a file in its entirety, so that it can’t change while you work on it (and if a source file doesn’t fit in memory, a compiler will have trouble processing it anyway!)