How to run an external command from an OCaml script

I’m trying to replace a bunch of bash scripts with OCaml, but I’m having difficulty capturing the output of external commands.

Here is my simple script:

#!/usr/bin/env utop

open Unix;;

let print_chan channel =
  let rec loop () =
      let () = print_endline (input_line channel) in
      loop ()
    in
  try loop ()
  with End_of_file -> close_in channel;;

let () =
  let (ocaml_stdout, ocaml_stdin, ocaml_stderr) = Unix.open_process_args_full "echo" [| "echo"; "foo" |] (Unix.environment ()) in
  close_out ocaml_stdin;
  print_chan ocaml_stdout;
  print_chan ocaml_stderr;
  print_endline "terminado!";

I am using utop as the toplevel to have access to common modules such as Unix.
Unfortunately, my script does not print anything except “terminado!”. I’d like to get the output of echo foo.

I think you’re missing a call to Unix.close_process_full (cf https://caml.inria.fr/pub/docs/manual-ocaml/libref/Unix.html#VALclose_process_full ) to wait for the termination of the command.

Hello,
This is not directly related to your question, but for shell-style scripting, the combination of Sys.command + Filename.quote_command is a very robust and portable alternative to using Unix, and is simpler to use.

Cheers,
Nicolas
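As a minimal sketch of the Sys.command + Filename.quote_command suggestion above (assuming OCaml 4.10 or later, where Filename.quote_command is available): the helper name run_quoted is hypothetical, and the command's output goes straight to the script's own stdout rather than being captured.

```ocaml
(* Hypothetical helper: quote a program and its arguments for the shell,
   run the resulting command line, and return the exit code. *)
let run_quoted prog args =
  (* quote_command escapes each argument, so spaces and quotes are safe. *)
  let cmd = Filename.quote_command prog args in
  Sys.command cmd

let () =
  let code = run_quoted "echo" [ "foo"; "it's a test" ] in
  Printf.printf "exit code: %d\n" code
```

Filename.quote_command also accepts optional ~stdin, ~stdout and ~stderr file names, which is how output ends up in a file with this approach.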

I tried doing so, but I get Exception: Sys_error "Bad file descriptor".

let () =
  let ((ocaml_stdout, ocaml_stdin, ocaml_stderr) as p) = Unix.open_process_args_full "echo" [| "echo"; "foo" |] (Unix.environment ()) in
  let _ = Unix.close_process_full p in
  close_out ocaml_stdin;
  print_chan ocaml_stdout;
  print_chan ocaml_stderr;
  print_endline "terminado!";

I would like to avoid creating temporary files for this use case. I don’t mind the heavier API (once I get it working).

Unix.close_process_full already takes care of closing the file descriptors, so you should not close them yourself.

Cheers,
Nicolas
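Put differently, the order matters: read everything first, then call Unix.close_process_full. Here is a rough sketch of that ordering, assuming the command produces little output; the names read_lines and run are hypothetical. Closing the command's stdin early is harmless, since closing an already closed channel does nothing.

```ocaml
(* Read every remaining line from an input channel, in order. *)
let read_lines ic =
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> List.rev acc
  in
  loop []

(* Hypothetical helper: run a shell command and return its stdout lines,
   its stderr lines and its exit status. *)
let run cmd =
  let ((out, inp, err) as proc) =
    Unix.open_process_full cmd (Unix.environment ())
  in
  (* We send nothing to the command, so signal end-of-file right away. *)
  close_out inp;
  (* Draining stdout fully before stderr is only safe for small outputs. *)
  let out_lines = read_lines out in
  let err_lines = read_lines err in
  (* Only now close the channels and wait for the process to terminate. *)
  let status = Unix.close_process_full proc in
  (out_lines, err_lines, status)

let () =
  let out_lines, _, _ = run "echo foo" in
  List.iter print_endline out_lines
```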

I think it will work better if the first parameter (“command to run”) is a complete file path, e.g. “/bin/echo”.

That will get the example working. When you try to apply this as you intend, I’m guessing you will encounter new problems related to output buffering. You can address one of those on your end - where you write to the process, consider a flush at critical points, so the data actually becomes available to the process.
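To illustrate the flushing advice, here is a small sketch that writes to a process and reads its reply. It assumes a POSIX tr is on the PATH and that the input is small enough to fit in the pipe; the helper name upcase is hypothetical.

```ocaml
(* Hypothetical helper: send lines to `tr a-z A-Z` and return its output. *)
let upcase lines =
  let ((from_cmd, to_cmd) as proc) = Unix.open_process "tr a-z A-Z" in
  List.iter (fun l -> output_string to_cmd (l ^ "\n")) lines;
  (* Until flushed, the data may still sit in OCaml's channel buffer. *)
  flush to_cmd;
  (* Closing the channel signals end-of-file to the process. *)
  close_out to_cmd;
  let rec read acc =
    match input_line from_cmd with
    | l -> read (l :: acc)
    | exception End_of_file -> List.rev acc
  in
  let result = read [] in
  (* Close the channels and wait for the process to terminate. *)
  ignore (Unix.close_process proc);
  result

let () = List.iter print_endline (upcase [ "foo"; "bar" ])
```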

If this is simply a bulk data processing job, that may be all you need. If it’s intermittent data, then you have the same problem on the other end, where you will need access to the code to fix it - if the process output goes through C stdio or something like it, and doesn’t explicitly flush the buffer, it may not become available for a while.

The other pitfall here is that the pipe device itself is of limited size, and when it fills up, the process blocks. If you’re writing, you can’t read at the same time, so you fill up the pipe on your end, and you’re both blocked. And you have two outputs, stderr and stdout, so the process can block on one while you’re reading on the other.

You can address some of this with Unix.select, but bear in mind that it operates on the underlying file descriptor and ignores buffered data, so there might be data that input could read while select says there’s nothing.
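One way to sidestep the buffering caveat is to never read through the channels at all and call Unix.read on the raw descriptors. The following is only a sketch under that assumption: it targets Unix-like systems, ignores EINTR, and the name run_select is hypothetical.

```ocaml
(* Hypothetical helper: collect stdout and stderr of a shell command
   concurrently, so neither pipe can fill up while we wait on the other. *)
let run_select cmd =
  let ((out, inp, err) as proc) =
    Unix.open_process_full cmd (Unix.environment ())
  in
  close_out inp;
  let out_fd = Unix.descr_of_in_channel out
  and err_fd = Unix.descr_of_in_channel err in
  let out_buf = Buffer.create 1024 and err_buf = Buffer.create 1024 in
  let chunk = Bytes.create 4096 in
  let rec loop fds =
    if fds <> [] then begin
      (* A negative timeout means: wait until some descriptor is readable. *)
      let ready, _, _ = Unix.select fds [] [] (-1.0) in
      let still_open =
        List.filter
          (fun fd ->
            if not (List.mem fd ready) then true
            else
              let n = Unix.read fd chunk 0 (Bytes.length chunk) in
              if n > 0 then
                Buffer.add_subbytes
                  (if fd = out_fd then out_buf else err_buf)
                  chunk 0 n;
              (* A read of 0 bytes means end-of-file: drop the descriptor. *)
              n > 0)
          fds
      in
      loop still_open
    end
  in
  loop [ out_fd; err_fd ];
  let status = Unix.close_process_full proc in
  (Buffer.contents out_buf, Buffer.contents err_buf, status)

let () =
  let out_text, err_text, _ = run_select "echo foo; echo bar 1>&2" in
  print_string out_text;
  prerr_string err_text
```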

Temporary files are a very professional way to go.
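For comparison, a temporary-file version might look roughly like this. It assumes OCaml 4.10 or later for Filename.quote_command, and the name run_to_file is hypothetical. Because the command writes to a file, no pipe can fill up.

```ocaml
(* Hypothetical helper: run a program with its stdout redirected to a
   temporary file, then return the exit code and the captured text. *)
let run_to_file prog args =
  let tmp = Filename.temp_file "capture" ".out" in
  Fun.protect
    ~finally:(fun () -> Sys.remove tmp)
    (fun () ->
      (* ~stdout makes quote_command append a redirection to tmp. *)
      let cmd = Filename.quote_command prog ~stdout:tmp args in
      let code = Sys.command cmd in
      let ic = open_in_bin tmp in
      let text =
        Fun.protect
          ~finally:(fun () -> close_in ic)
          (fun () -> really_input_string ic (in_channel_length ic))
      in
      (code, text))

let () =
  let code, text = run_to_file "echo" [ "foo" ] in
  Printf.printf "exit code %d, output: %s" code text
```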

Even if I don’t close them myself, I still get this error:

#!/usr/bin/env utop

open Unix;;

let print_chan channel =
  let rec loop () =
      let () = print_endline (input_line channel) in
      loop ()
    in
  try loop ()
  with End_of_file -> ();;

let () =
  let ((ocaml_stdout, ocaml_stdin, ocaml_stderr) as p) = Unix.open_process_args_full "echo" [| "echo"; "foo" |] (Unix.environment ()) in
  let _ = Unix.close_process_full p in
  print_chan ocaml_stdout;
  print_chan ocaml_stderr;
  print_endline "terminado!";
|> ./capture.ml
Exception: Sys_error "Bad file descriptor".

I guess once I close the process I can’t access the channels anymore.

Thanks! Specifying the whole binary path did fix the issue.
The difference between open_process and open_process_args was not very clear in my head, but since I want to run commands in a shell, I should use open_process.

Here is my working code in case it can help someone:

#!/usr/bin/env utop

open Unix;;

let print_chan channel =
  let rec loop () =
      let () = print_endline (input_line channel) in
      loop ()
    in
  try loop ()
  with End_of_file -> close_in channel;;

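let () =
  (* Pass the current environment; an empty array would start the command with no variables at all. *)
  let ((ocaml_stdout, ocaml_stdin, ocaml_stderr) as p) = Unix.open_process_full "echo foo" (Unix.environment ()) in
  close_out ocaml_stdin;
  print_chan ocaml_stdout;
  print_chan ocaml_stderr;
  (* Wait for the command to finish; closing the already closed channels again is harmless. *)
  ignore (Unix.close_process_full p);
  print_endline "terminado!";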

As for using temporary files, I don’t think I need it in my use case but you made good points about the pipe size limitation. I’ll keep it in mind if I run into pipe issues.