When does `Printf.printf` flush?

I understand I can force flushing to occur by suffixing my string with the special format arg %!

Printf.printf "--> LOG: [%s]\n%!" some_var

If I understand things correctly, this is equivalent to:

Printf.printf "--> LOG: [%s]" some_var; print_newline ()

I’d like to understand more when flushing is expecting to happen using Printf.printf because I got a little surprised seeing something outputted way later than expected in my program.

I tried to find the answer by reading the source but the Printf module is fairly involved, I’ll probably make a mistake trying to analyze it.

I don’t know the answer, but after long experience with printf/fprintf (even on stderr) I concluded that the only safe rule is “it only flushes when you use %!” and so, when I expect something to be flushed, I make sure to indicate that. This is in marked contrast to C stdio, but hey, at this point, 30+ years after caml-light started, there’s no point in changing it.

A few remarks:

  • The %! annotation is equivalent to flush stdout when appearing in the argument of Printf.printf.

  • The flushing behaviour of Printf.printf has nothing to do with Printf itself; it is the behaviour of the underlying out_channel that is at play (in the case of Printf.printf the underlying output channels is the standard output, stdout).

  • Output channels (out_channel) implement a standard buffering mechanism for efficiency, and the data written to them will be sent to the output device whenever the buffer is “flushed”. This typically occurs when it becomes full, or if flush is called. The default buffer size for channels in OCaml is 65536 bytes, and it is not possible to change it (easily).

Cheers,
Nicolas

2 Likes

Following up on Nicolas’ comment, we see ultimately the logic ends up here: ocaml/io.c at 23dab79a4e42856aa33816b9c79c3d4d79959cb9 · ocaml/ocaml · GitHub

/* Output data */

#define Putch(channel, ch) do{                                            \
  if ((channel)->curr >= (channel)->end) caml_flush_partial(channel);     \
  *((channel)->curr)++ = (ch);                                            \
}while(0)
1 Like

[caveat: again, this behaviour has been there as long as I can remember, and at this point, I’m not sure it’s worth fixing]

I think the discrepancy is the following: in libc(3), when using stdin/stdout/stderr,

  1. if you write to stderr, it is unbuffered by default.

  2. if you write to stdout, it is buffered by default

  3. but if you then read from stdin, stdout gets flushed.

  4. and if you write a newline to stdout, that gets flushed.

Again, those are the defaults, and it’s what every C programmer has come to expect. If one programs in OCaml long enough, one simply adjusts to the different set of behaviours of OCaml streams which are:

  1. all streams are buffered

  2. and no flushing happens at newlines, nor when input is read

  3. so always do your own flushing

I just went thru the process of rechecking these behaviours with both ocaml and libc on my Linux box, so I think what I wrote above is accurate.

1 Like

Making stderr unbuffered by default was discussed upstream not so long ago.

Nice feedback thanks. That looks like a big buffer, I’ll have to remember to flush regularly then.

In my case, I was:

  1. printing a line to stdout
  2. then printing many lines to stdout indirectly by calling a C lib function: observing output
  3. terminating my program, then I would see output from step 1

Does stdout always get flushed by the OCaml runtime on program exit 0?

Thanks for that useful recap, very useful.

Interesting. It’s unclear to me if I should probably run:

Out_channel.set_buffered stderr false

at the start of my program though.

Or potentially loose some buffered content in case of an unexpected crash?

Yes.

I don’t recommend it in general (unbuffered writes are rather inefficient). I tend to flush explicitly if needed in a specific context.

Note that channels should be flushed on exit even in case of an uncought exceptions (or when calling exit), so a priori this is only an issue if there is a bug in unsafe code (eg C bindings) or if there is a bug in the compiler.

Cheers,
Nicolas

Maybe not, but if you are using channels for logging you likely want each log statement to have flushed the output channel you log on when it returns. For example Logs’s default reporter do that.

It’s just a matter of defining your own log function and then use that consistently. For example here on stderr but could be on any channel with more setup:

let log fmt = Format.eprintf ("@[" ^^ fmt ^^ "@]@.");;

Great, that’s good to know.

Thanks for the tip! I didn’t know about the ^^ operator.

Since I’m exploring OCaml and my programs are rather small and don’t output much, I’ll keep it simple and use a utility function. This is what I want 99% of the time.

let println fmt = Printf.printf (fmt ^^ "\n%!")

(* prints "DEBUG (1, 2)" *)
let () = println "%s (%d, %d)" "DEBUG" 1 2

The Format module looks quite interesting though :slight_smile:

Can I put in a plug for the fmt package? It’s … better than sliced bread. Truly.

I hope the better sliced bread can simply be replaced by a good runtime type representation in the future :–) I always feel silly writing M.pp values for my types with fmt when it’s obvious the code structure of the formatter could be used for many other encoding tasks.

1 Like

Thanks for letting me know, I’ll try it for sure once my printing needs become more complicated :slightly_smiling_face:

From what I understand OCaml’s channel have their own buffer distinct from say the one of FILE *. So I always have the habit of flushing from the OCaml side before calling C functions that can call printf and similar (and likewise flush from C before returning to OCaml).

1 Like