How do you read the lines of a text file…

For years I have rewritten the same while loop that accumulates lines in a string list ref, catches End_of_file, and returns List.rev of the accumulator (or a variant with a tail-recursive function and the exception rewrapped as an option or, more recently, matching on End_of_file).
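For reference, the pattern described above looks roughly like this. It is only a minimal sketch: the name read_lines_classic is made up for illustration, and the channel is not closed if anything other than End_of_file is raised.

```ocaml
(* Classic idiom: accumulate lines in a ref, stop on End_of_file,
   then reverse the accumulator to restore file order. *)
let read_lines_classic path =
  let ic = open_in path in
  let acc = ref [] in
  (try
     while true do
       acc := input_line ic :: !acc
     done
   with End_of_file -> ());
  close_in ic;
  List.rev !acc
```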

For 4 years, I could simply have used Arg.read_arg "file.txt" most of the time and gotten an array of all the lines in file.txt, with the end-of-line character stripped.
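A small sketch of that call in use, assuming OCaml 4.05 or later (where Arg.read_arg is available). The wrapper read_lines and the temporary file are illustrative only; they are there because a list is often handier than the string array that Arg.read_arg returns.

```ocaml
(* Hypothetical wrapper: the lines of [path] as a list rather than an array. *)
let read_lines path = Array.to_list (Arg.read_arg path)

let () =
  (* Create a throwaway file so the example is self-contained. *)
  let path = Filename.temp_file "read_arg" ".txt" in
  let oc = open_out_bin path in
  output_string oc "alpha\nbeta\ngamma\n";
  close_out oc;
  (* Each element is one line, without its trailing newline. *)
  List.iter print_endline (read_lines path);
  Sys.remove path
```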

I realized yesterday morning this function existed (as well as Arg.write_file and two variants that read strings separated by \000 instead of newline) while searching for something in the documentation of the Arg module.
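The \000-separated variants are named Arg.read_arg0 and Arg.write_arg0 in the Stdlib. They come in handy when the strings themselves may contain newlines. A minimal round-trip sketch (roundtrip0 is a made-up helper name, and Fun.protect assumes OCaml 4.08 or later):

```ocaml
(* Hypothetical helper: write [items] separated by '\000' to a temporary
   file, read them back, and always delete the file afterwards. *)
let roundtrip0 (items : string array) : string array =
  let path = Filename.temp_file "read_arg0" ".bin" in
  Fun.protect
    ~finally:(fun () -> Sys.remove path)
    (fun () ->
      Arg.write_arg0 path items;
      Arg.read_arg0 path)

let () =
  (* The embedded newline survives because '\000' is the separator. *)
  let items = [| "first line\nsecond line"; "plain" |] in
  assert (roundtrip0 items = items)
```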

I hope this shaves a few bytes from everyone’s programs next time someone quickly wants to grab the content of a file (or save a list of strings as a file). And now I’m wondering what other gems I have missed…

I guess you mean Arg.write_arg rather than Arg.write_file, but clever!

Wait, Arg is a module in OCaml? WOW

Amazing. You learn something new every day.

@K_N thanks to @nojb’s work, these mundane tasks finally get bearable with the upcoming 4.14 Stdlib.

For example:

let lines file =
  let contents = In_channel.with_open_bin file In_channel.input_all in
  String.split_on_char '\n' contents

Or to support the widespread tool convention that filename - means stdin:

let lines file =
  let contents = match file with
  | "-" -> In_channel.input_all In_channel.stdin
  | file -> In_channel.with_open_bin file In_channel.input_all
  in
  String.split_on_char '\n' contents

And for writing:

let write_lines file lines =
  let output_lines oc = output_string oc (String.concat "\n" lines) in
  Out_channel.with_open_bin file output_lines

Or, with the same convention that filename - means stdout:

let write_lines file lines =
  let output_lines oc = output_string oc (String.concat "\n" lines) in
  match file with
  | "-" -> output_lines Out_channel.stdout
  | file -> Out_channel.with_open_bin file output_lines

I use input_line. Was there from day 1.

I’ve had utility functions like these in my “utils” toolbelt for … 25 years. It’ll be great to discard them. Huzzah!

Be careful, input_line is a footgun and has led to more than one bug out there – along with open_in and open_out defaulting to text mode and thus lying by default about your data.

input_line will never report an empty final line and performs newline translations if your channel is in text mode. This means you can’t expect to recover the exact file contents you just read by doing String.concat "\n" on the lines you input with input_line.
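To make the difference concrete, here is a small sketch, assuming OCaml 4.14 for In_channel and Out_channel. The helper names temp_with, lines_via_input_line and lines_via_split are made up for illustration. Two files that differ only in their final newline are indistinguishable through input_line, while splitting the full contents keeps the distinction.

```ocaml
(* Hypothetical helper: write [contents] to a fresh temporary file. *)
let temp_with contents =
  let path = Filename.temp_file "footgun" ".txt" in
  Out_channel.with_open_bin path (fun oc -> output_string oc contents);
  path

(* Read all lines one by one with In_channel.input_line. *)
let lines_via_input_line path =
  In_channel.with_open_bin path (fun ic ->
      let rec loop acc =
        match In_channel.input_line ic with
        | Some l -> loop (l :: acc)
        | None -> List.rev acc
      in
      loop [])

(* Read the whole file, then split on '\n'. *)
let lines_via_split path =
  String.split_on_char '\n'
    (In_channel.with_open_bin path In_channel.input_all)

let () =
  let with_nl = temp_with "a\nb\n" and without_nl = temp_with "a\nb" in
  (* input_line cannot tell the two files apart... *)
  assert (lines_via_input_line with_nl = lines_via_input_line without_nl);
  (* ...whereas splitting keeps the empty final line. *)
  assert (lines_via_split with_nl = [ "a"; "b"; "" ]);
  assert (lines_via_split without_nl = [ "a"; "b" ]);
  Sys.remove with_nl;
  Sys.remove without_nl
```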

Also, of course, it doesn’t help with making sure you correctly close your channels and don’t leak them in case of an exception. The new functions finally make that a no-brainer.
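On compilers older than 4.14, a rough stand-in can be written with Fun.protect (available since 4.08). This is only a sketch under that assumption; with_open_in_bin is a made-up name and is not claimed to match how the Stdlib implements In_channel.with_open_bin.

```ocaml
(* Hypothetical helper: open [path] in binary mode, run [f] on the channel,
   and close the channel whether [f] returns or raises. *)
let with_open_in_bin path f =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in_noerr ic) (fun () -> f ic)

(* Example use: read the whole file as a single string. *)
let read_all path =
  with_open_in_bin path (fun ic ->
      really_input_string ic (in_channel_length ic))
```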

Is it correct to say that String.concat will make an in-memory copy of all the lines before writing to the file?

Yes of course, simply replace output_lines by

let rec output_lines oc = function
| [] -> ()
| l :: ls -> 
    output_string oc l; 
    if ls <> [] then (output_char oc '\n'; output_lines oc ls)

If that’s a problem for the amount of data you are dealing with.
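Put together with the earlier write_lines, the streaming version might look like the following sketch (assuming OCaml 4.14 for Out_channel). It writes each line directly to the channel, with a newline between lines and none after the last, so it produces the same bytes as the String.concat version without building the intermediate string.

```ocaml
(* Write the lines separated by '\n', with no newline after the last one. *)
let rec output_lines oc = function
  | [] -> ()
  | [ l ] -> output_string oc l
  | l :: ls ->
      output_string oc l;
      output_char oc '\n';
      output_lines oc ls

(* As before, the filename "-" means stdout. *)
let write_lines file lines =
  match file with
  | "-" -> output_lines Out_channel.stdout lines
  | file -> Out_channel.with_open_bin file (fun oc -> output_lines oc lines)
```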
