Retrieving the integer result of a Unix command inside OCaml code

On my Mac (macOS, which is Unix-based), I have a command mdls -name kMDItemNumberOfPages -raw that reports the number of pages in a PDF.
To get that integer inside an OCaml program, I proceed as follows:

  1. I create a temporary file with Sys.command "touch temp.txt"
  2. Store the result in the temporary file with Sys.command "mdls -name kMDItemNumberOfPages -raw mypdf.pdf > temp.txt"
  3. Using open_in and the Buffer module, read the contents of temp.txt, then call int_of_string on that content.
  4. Delete the temporary file with Sys.command "rm temp.txt"
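
For reference, the four steps above can be sketched in OCaml using only the standard library. This is a minimal sketch, not the poster's actual code: the helper name int_of_command_via_tempfile is hypothetical, and it assumes the command prints the integer on its first output line.

```ocaml
(* Temp-file approach: run [cmd] with stdout redirected to a fresh
   temporary file, then parse the first line of that file as an integer.
   [int_of_command_via_tempfile] is a hypothetical helper name. *)
let int_of_command_via_tempfile (cmd : string) : int =
  (* Filename.temp_file creates the file, so no "touch" is needed. *)
  let tmp = Filename.temp_file "cmd_output" ".txt" in
  let cleanup () = try Sys.remove tmp with Sys_error _ -> () in
  match Sys.command (cmd ^ " > " ^ Filename.quote tmp) with
  | 0 ->
    let ic = open_in tmp in
    (* input_line also returns a final line that lacks a newline. *)
    let line = try input_line ic with End_of_file -> "" in
    close_in ic;
    cleanup ();
    int_of_string (String.trim line)
  | code ->
    cleanup ();
    failwith (Printf.sprintf "command exited with code %d" code)
```

It would be called as int_of_command_via_tempfile "mdls -name kMDItemNumberOfPages -raw mypdf.pdf". Using Filename.temp_file and Sys.remove avoids the separate touch and rm shell calls.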

This is perhaps overkill? Perhaps one can do it using channels only, with no temporary files?

If you use the Unix library, you can directly read the stdout of the program via Unix.open_process_in.
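
A minimal sketch of that idea, assuming the command prints the integer on its first output line. The helper name int_of_command is hypothetical, and the code needs the unix library to be linked or loaded.

```ocaml
(* Requires the unix library (e.g. #require "unix";; in a toplevel
   that has findlib loaded). *)

(* Run [cmd] and parse the first line of its standard output as an int.
   [int_of_command] is a hypothetical helper name. *)
let int_of_command (cmd : string) : int =
  let ic = Unix.open_process_in cmd in
  (* input_line also returns a final line that lacks a newline. *)
  let line = try input_line ic with End_of_file -> "" in
  (* close_process_in closes the channel and waits for the command. *)
  ignore (Unix.close_process_in ic);
  int_of_string (String.trim line)
```

For the PDF case this would be called as int_of_command "mdls -name kMDItemNumberOfPages -raw mypdf.pdf". No temporary file is involved: the channel is connected to the command's standard output through a pipe.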

If I understand correctly, let my_channel = Unix.open_process_in "some_cmd";; stores the result of some_cmd in a value my_channel of type in_channel. How do I retrieve the result stored in my_channel after that?

Get the length of the input channel (there is a function for that in Stdlib), then use

really_input_string ic len |> int_of_string

You can't: the length of a program's standard output is not known in advance.

It does not store the result in the in_channel; it gives you an in_channel from which you can read the program's standard output using the usual channel input functions (for example input_line or input).
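
Since the length is unknown, one option is to read in chunks until end of file. The following is a minimal sketch using only the standard library; read_all is a hypothetical helper name, and it works on any in_channel, including one returned by Unix.open_process_in.

```ocaml
(* Read everything available on [ic] until end of file, in fixed-size
   chunks, without knowing the total length in advance.
   [read_all] is a hypothetical helper name. *)
let read_all (ic : in_channel) : string =
  let chunk = Bytes.create 4096 in
  let buf = Buffer.create 4096 in
  let rec loop () =
    (* [input] returns 0 only when end of file has been reached. *)
    let n = input ic chunk 0 (Bytes.length chunk) in
    if n > 0 then begin
      Buffer.add_subbytes buf chunk 0 n;
      loop ()
    end
  in
  loop ();
  Buffer.contents buf
```

The integer would then be obtained with read_all ic |> String.trim |> int_of_string. When the output is known to fit on one line, input_line is simpler.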

I tried it this morning (yesterday I had only read the linked manual sections) and, surprisingly, it doesn't work at all in my case. My call to Unix.open_process_in not only raises a Unix.Unix_error(Unix.EINTR, "select", "") exception, but also makes my utop toplevel quit suddenly.

I'm using utop 2.2.0 and OCaml 4.07.0, if that matters.

UPDATE: your solution works fine if I use it outside utop, in the plain ocaml toplevel. I'm still interested in a utop-compatible solution, however.

UPDATE 2: I found out that utop crashes if I call Unix.open_process_in directly at the top level (say, with let chan = Unix.open_process_in cmd;;), but not if I wrap the call in a let ... in; thus utop accepts the following snippet:

(* Upper bound on the number of bytes read from the command output. *)
let reasonable_size = 100;;

let number_of_pages_in_pdf_file pdfname =
  let cmd = "mdls -name kMDItemNumberOfPages -raw " ^ pdfname in
  let chan = Unix.open_process_in cmd in
  let buf = Bytes.create reasonable_size in
  (* input returns the number of bytes actually read. *)
  let final_size = input chan buf 0 reasonable_size in
  let _ = Unix.close_process_in chan in
  int_of_string (Bytes.sub_string buf 0 final_size);;

number_of_pages_in_pdf_file "my_pdf.pdf";;
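
A possible variant of that snippet, offered as a sketch rather than a definitive version: it quotes the file name for the shell, checks the exit status returned by Unix.close_process_in, and returns an option instead of raising when the output is not a number. The names int_of_command_opt and number_of_pages_opt are hypothetical, and the code assumes OCaml 4.05 or later for int_of_string_opt.

```ocaml
(* Requires the unix library. [int_of_command_opt] and
   [number_of_pages_opt] are hypothetical names. *)

(* Run [cmd]; return [Some n] only if it exits with status 0 and its
   first output line parses as an integer. *)
let int_of_command_opt (cmd : string) : int option =
  let ic = Unix.open_process_in cmd in
  let line = try input_line ic with End_of_file -> "" in
  match Unix.close_process_in ic with
  | Unix.WEXITED 0 -> int_of_string_opt (String.trim line)
  | _ -> None

(* Filename.quote protects file names containing spaces or shell
   metacharacters. *)
let number_of_pages_opt (pdfname : string) : int option =
  int_of_command_opt
    ("mdls -name kMDItemNumberOfPages -raw " ^ Filename.quote pdfname)
```

Here number_of_pages_opt "my pdf.pdf" passes the name to the shell as a single argument, which the plain string concatenation above would not do for a name containing a space.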

I'm not using utop, so I can't tell you exactly. But that's precisely one of the reasons I don't: its implementation fiddles with your environment and embeds too much in it (notably lwt).

So I prefer to stick to ocaml in general (with rlwrap or down).

This is my "screenshot". You may want to give the new version a try.

  │ Welcome to utop version 2.6.0 (using OCaml version 4.08.1)! │
Findlib has been successfully loaded. Additional directives:
  #require "package";;      to load a package
  #list;;                   to list the available packages
  #camlp4o;;                to load camlp4 (standard syntax)
  #camlp4r;;                to load camlp4 (revised syntax)
  #predicates "p,q,...";;   to set these predicates
  Topfind.reset();;         to force that packages will be reloaded
  #thread;;                 to enable threads

Type #utop_help for help about using utop.

─( 11:02:58 )─< command 0 >─────────────────────────{ counter: 0 }─
utop # let cin= Unix.open_process_in "echo 20";;
val cin : in_channel = <abstr>
─( 11:02:59 )─< command 1 >─────────────────────────{ counter: 0 }─
utop # cin |> input_line |> String.trim |> int_of_string;;
- : int = 20
─( 11:03:03 )─< command 2 >─────────────────────────{ counter: 0 }─
utop # close_in cin;;
- : unit = ()
─( 11:03:05 )─< command 3 >─────────────────────────{ counter: 0 }─
utop # 