Pretty-printing with Format

In the minimal example below, I define a type t and then print its values using the function pp. That function contains redundant code, which I am trying to factor out (in my real use case there is much more of it):

  1. The first solution pp_sequential seems to work but sequentiality and splitting the opening and closing of the box look strange (but maybe aren’t?).
  2. The second solution pp_naive is type-incorrect but shows my intent.
  3. The third solution pp_assumes looks the most natural to me, but assumes the existence of a function fmt_printf which returns a format. Is such a function even definable?
  4. The fourth and last solution pp_k is incorrect since the code in the match statement cannot know the correct indentation (I was also expecting to use kfprintf, but cannot see how).

What would be the “correct” (idiomatic, generic, efficient) way to code this in OCaml?

type t = A of int list | B of float list
let ppints = Format.(pp_print_list pp_print_int)
let ppfloats = Format.(pp_print_list pp_print_float)

let pp ppf = function
  | A ints -> Format.fprintf ppf "List: @[<v>%a@]@." ppints ints
  | B floats -> Format.fprintf ppf "List: @[<v>%a@]@." ppfloats floats

let pp_sequential ppf t =
  Format.fprintf ppf "List: @[<v>";
  (match t with
  | A ints -> Format.fprintf ppf "%a" ppints ints
  | B floats -> Format.fprintf ppf "%a" ppfloats floats);
  Format.fprintf ppf "@]@."

let pp_naive ppf t =
  (* Type-incorrect. *)
  Format.fprintf ppf "List: @[<v>%a@]@."
    (match t with
    | A ints -> ppints ints
    | B floats -> ppfloats floats)
      
let pp_assumes ppf t =
  (* This assumes there is a printing function returning a format. *)
  let fmt =
    match t with
    | A ints -> Format.fmt_printf "%a" ppints ints
    | B floats -> Format.fmt_printf "%a" ppfloats floats
  in
    Format.fprintf ppf ("List: @[<v>" ^^ fmt ^^ "@]@.")

let pp_k ppf t =
  (* Incorrect since the code in "match" cannot know indentation. *)
  let k s = Format.fprintf ppf "List: @[<v>%s@]@." s in
    match t with
    | A ints -> Format.kasprintf k "%a" ppints ints
    | B floats -> Format.kasprintf k "%a" ppfloats floats

Thanks!

In short: pp_sequential is idiomatic OCaml and an OK solution, except that you can avoid calling fprintf and instead call ppints and ppfloats directly:

let pp_sequential ppf t =
  Format.fprintf ppf "List: @[<v>";
  (match t with
   | A ints -> ppints ppf ints
   | B floats -> ppfloats ppf floats);
  Format.fprintf ppf "@]@."

You can also do it using a local function (close to your pp_naive):

let pp_naive ppf t =
  Format.fprintf ppf "List: @[<v>%t@]@."
    (fun ppf ->
     match t with
     | A ints -> ppints ppf ints
     | B floats -> ppfloats ppf floats)

(note the use of %t instead of %a to avoid having to pass t as an argument).
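
For what it's worth, here is a self-contained sketch comparing the %t version with an equivalent %a version. The names pp_with_t, pp_with_a and pp_body are made up for the comparison, and the trailing @. is dropped so that the result can be captured with Format.asprintf.

```ocaml
type t = A of int list | B of float list

let ppints = Format.(pp_print_list pp_print_int)
let ppfloats = Format.(pp_print_list pp_print_float)

(* %t consumes one argument: a closure of type Format.formatter -> unit. *)
let pp_with_t ppf t =
  Format.fprintf ppf "List: @[<v>%t@]"
    (fun ppf ->
      match t with
      | A ints -> ppints ppf ints
      | B floats -> ppfloats ppf floats)

(* %a consumes two arguments: a printer and the value to print. *)
let pp_body ppf = function
  | A ints -> ppints ppf ints
  | B floats -> ppfloats ppf floats

let pp_with_a ppf t = Format.fprintf ppf "List: @[<v>%a@]" pp_body t

let () =
  let v = A [ 1; 2; 3 ] in
  let s_t = Format.asprintf "%a" pp_with_t v in
  let s_a = Format.asprintf "%a" pp_with_a v in
  (* In both variants the box is already open when the inner printer runs. *)
  assert (s_t = s_a);
  print_endline s_t
```

Either way the inner printer writes to the same formatter, so it sees the enclosing box, which is what pp_k cannot do.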

Cheers,
Nicolas


I wish there was a cookbook for this sort of advanced use of format strings (maybe there is? the actual documentation and the tutorial linked in it are almost exclusively about the boxes etc.). I recently puzzled for a long time over how to write a function that accepts formats based on another function that accepts formats. I was looking at the following function from ppxlib:

val error_extensionf :
  loc:t -> ('a, Format.formatter, unit, extension) format4 -> 'a

which is written as follows:

let error_extensionf ~loc fmt =
  Format.kasprintf
    (fun str -> Error.to_extension @@ Error.make ~loc ~sub:[] str)
    fmt

This use of kasprintf lets you use error_extensionf with arbitrary format strings. This is nice, but I really struggled when I tried to wrap the function further while preserving the ability to use format strings. I initially assumed you had to use kasprintf or kfprintf to do so, on the reasoning that they are the functions you use to play with the last parameter of the format type. But as far as I can tell, kasprintf only gives you an actual string, and kfprintf cannot be used in this scenario because you don’t have a formatter. What you can do is use kdprintf, as follows:

let wrapper ~loc fmt =
  Format.kdprintf (fun ppf -> do_thing (error_extensionf ~loc "%t" ppf)) fmt

I still don’t quite get why this is the solution. I figured it out by asking myself “what well-typed expression can I write in this setting”, but the documentation of dprintf is about pretty-printing and does not suggest at all that it is useful to get more general types, and it feels weird (and probably inefficient?) to have to write a new, trivial format string (I mean "%t") for a simple map of the result. Of course it’s very possible I missed something.
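
To make the types concrete, here is a minimal sketch of the same wrapper with stand-ins for the ppxlib pieces. The extension type, error_extensionf (without ~loc) and do_thing below are all hypothetical simplifications, not the real ppxlib definitions.

```ocaml
(* Hypothetical stand-ins, so that the sketch runs on its own. *)
type extension = Ext of string

let error_extensionf fmt = Format.kasprintf (fun str -> Ext str) fmt

(* Some post-processing whose result type differs from extension. *)
let do_thing (Ext msg) = String.length msg

(* kdprintf consumes the arguments requested by fmt, then hands the
   continuation a delayed printer of type Format.formatter -> unit.
   A value of that type is exactly what "%t" expects. *)
let wrapper fmt =
  Format.kdprintf (fun printer -> do_thing (error_extensionf "%t" printer)) fmt

let () =
  (* "%d-%s" applied to 12 and "ab" renders as "12-ab", of length 5. *)
  assert (wrapper "%d-%s" 12 "ab" = 5);
  (* A format without conversions takes no further arguments. *)
  assert (wrapper "no arguments" = 12)
```

The "%t" literal is there only to give error_extensionf a format whose single argument is the delayed printer.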

I’m bringing this up because like the first question, this is some rather basic usage of format strings that is hard to figure out just reading the doc. Is there somewhere where the use of the Format.*printf functions and the meaning of the various types used by Format is explained in more detail? Like what do the parameters of format4 mean (it’s in the doc of Stdlib but without examples), how to understand the types of the k*printf functions’ continuation arguments, how do you typically obtain or use a formatter -> 'a or a formatter -> 'a -> unit or whatever, all the things %t can be used for, etc. I think I sort of understand all of this now but it was rather hard.
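
As a partial answer to my own question, here is how I currently read the four parameters of format4 when used with Format. This is a sketch of my understanding rather than an authoritative description; the value names are made up.

```ocaml
(* ('a, 'b, 'c, 'd) format4, as used by the Format printing functions:
   'a  the function type that consumes the format's arguments and ends in 'd
   'b  the output target handed to %a and %t printers (Format.formatter)
   'c  the result type of %a and %t printers (unit)
   'd  the final result, fixed by the printing function being called *)

(* printf-like use: the final result is unit. *)
let f_unit : (int -> string -> unit, Format.formatter, unit, unit) format4 =
  "%d %s"

(* asprintf fixes 'd to string. *)
let f_string : (int -> string, Format.formatter, unit, string) format4 = "%d"

(* %t asks for a closure 'b -> 'c, that is Format.formatter -> unit. *)
let f_t :
    ((Format.formatter -> unit) -> string, Format.formatter, unit, string)
    format4 =
  "%t"

(* %a asks for a printer 'b -> 'x -> 'c followed by a value of type 'x. *)
let f_a :
    ( (Format.formatter -> int -> unit) -> int -> string,
      Format.formatter,
      unit,
      string )
    format4 =
  "%a"

(* With kasprintf, 'd is whatever the continuation returns. *)
let f_int : (int -> int, Format.formatter, unit, int) format4 = "%d"

let () =
  Format.printf f_unit 1 "one";
  Format.print_newline ();
  assert (Format.asprintf f_string 42 = "42");
  assert (Format.asprintf f_t (fun ppf -> Format.pp_print_string ppf "hi") = "hi");
  assert (Format.asprintf f_a Format.pp_print_int 7 = "7");
  assert (Format.kasprintf String.length f_int 1234 = 4)
```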


You may be interested in reading the Gagallium post “The 6 parameters of ('a, 'b, 'c, 'd, 'e, 'f) format6” (it talks about format6, but format and format4 are just special cases).

Cheers,
Nicolas


Thank you, this is a nice explanation, much more complete than the description in the stdlib doc. It doesn’t really go into the practical aspects though.

Concerning your question about combining format-string-based functions, one alternative would be to define a continuation version of error_extensionf:

let error_extensionkf ~loc k fmt =
  Format.kasprintf
    (fun str -> k @@ Error.to_extension @@ Error.make ~loc ~sub:[] str)
    fmt

Then your wrapper function just chooses the right continuation:

let wrapper ~loc fmt = error_extensionkf ~loc do_thing fmt
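
A runnable sketch of this pattern, assuming simplified stand-ins for the ppxlib pieces (the extension type and do_thing are hypothetical, and ~loc is omitted):

```ocaml
(* Hypothetical stand-in for the ppxlib extension type. *)
type extension = Ext of string

(* Continuation version: k receives the extension and decides the result. *)
let error_extensionkf k fmt = Format.kasprintf (fun str -> k (Ext str)) fmt

(* The plain version is the continuation version with the identity. *)
let error_extensionf fmt = error_extensionkf Fun.id fmt

(* Hypothetical post-processing step. *)
let do_thing (Ext msg) = String.uppercase_ascii msg

(* The wrapper only picks a different continuation. *)
let wrapper fmt = error_extensionkf do_thing fmt

let () =
  assert (error_extensionf "%d errors" 3 = Ext "3 errors");
  assert (wrapper "%s=%d" "x" 1 = "X=1")
```

With the continuation exposed, further wrappers compose by composing continuations, and no extra "%t" format is needed.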

Without this continuation function, your use of kdprintf is quite reasonable here: one use of Format.kdprintf is to turn a format string and its arguments into a %t printer. Conversely, we can define:

let to_printer_function (f : loc:_ -> (_, _, _, _) format4 -> _) ~loc pr =
  f ~loc "%t" pr

to create a %t-based function from a format-string-based function. Thus your wrapper function can be read as:

let wrapper ~loc fmt =
  let error_of_printer = to_printer_function error_extensionf in
  Format.kdprintf (fun printer -> do_thing (error_of_printer ~loc printer)) fmt

In other words, your function uses Format.kdprintf to gather the arguments of the format string as a %t printer, then passes this printer to the %t version of error_extensionf, which could have been defined directly as:

let string_of_printer printer =
  let b = Buffer.create 10 in
  let ppf = Format.formatter_of_buffer b in
  printer ppf;
  Format.pp_print_flush ppf ();
  Buffer.contents b

let error_of_printer ~loc printer =
  Error.to_extension (Error.make ~loc ~sub:[] @@ string_of_printer printer)
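
As a quick sanity check, here is a standalone sketch of string_of_printer, compared against Format.asprintf "%t", which should give the same string. The name string_of_printer' and the sample printer are only for illustration.

```ocaml
(* Render a delayed printer into a string through a buffer-backed formatter. *)
let string_of_printer printer =
  let b = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer b in
  printer ppf;
  (* Flush so that pending output reaches the buffer before it is read. *)
  Format.pp_print_flush ppf ();
  Buffer.contents b

(* The same result, obtained from the standard library directly. *)
let string_of_printer' printer = Format.asprintf "%t" printer

(* A delayed printer built with dprintf: Format.formatter -> unit. *)
let sample = Format.dprintf "%d items, %s" 3 "ok"

let () =
  assert (string_of_printer sample = "3 items, ok");
  assert (string_of_printer sample = string_of_printer' sample)
```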

At some point, I should try to find the time to write more examples of this form. Nowadays, however, I tend to prefer the approach proposed in ocaml/ocaml pull request #13372 by zazedd (“New Format and Printf printf-like functions that accept a heterogeneous list as arguments”), which replaces the continuation argument with a heterogeneous list of arguments. With this approach, it is possible to define error_extensionf in an extensible way as:

let error_extensionf ~loc fmt args =
  let str = Format.lasprintf fmt args in
  Error.to_extension @@ Error.make ~loc ~sub:[] str

let wrapper ~loc fmt args =
  let err = error_extensionf ~loc fmt args in
  do_thing err

This is helpful, thanks! The heterogeneous list version looks a lot nicer to use indeed, the code you get for my use case is a lot more natural and I wouldn’t have had difficulty coming up with it.

I’ve spent some time looking at the code of Format, wondering how to do what I initially thought should be possible, namely a function that composes with a printf-style function without using an extra format string like "%t", or ideally a generic version of that. I think I now get why it doesn’t work. If you apply error_extensionf to a format string, it forces the last parameter of the format to extension, and for a particular application the first parameter will be some function type that ends in extension, call it extension_func. But when you apply wrapper to a format string, the last parameter will be foo (the return type of do_thing) and the first one will be foo_func. foo_func and extension_func are related in a systematic way, but there is no simple way to express that relation in order to convert your format or to write the type of a generic make_wrapper function.
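
To illustrate the mismatch, here is a small sketch where the same literal is typed once for error_extensionf and once for a wrapper returning int (playing the role of foo). The definitions are simplified, hypothetical stand-ins without ~loc.

```ocaml
(* Hypothetical stand-ins for the ppxlib pieces. *)
type extension = Ext of string

let error_extensionf fmt = Format.kasprintf (fun str -> Ext str) fmt
let do_thing (Ext msg) = String.length msg

(* Typed for error_extensionf: the first parameter ends in extension. *)
let for_error :
    (int -> string -> extension, Format.formatter, unit, extension) format4 =
  "%d %s"

(* Typed for a wrapper returning int: the first parameter ends in int. *)
let for_wrapper : (int -> string -> int, Format.formatter, unit, int) format4 =
  "%d %s"

let () =
  assert (error_extensionf for_error 1 "a" = Ext "1 a");
  (* [error_extensionf for_wrapper 1 "a"] does not type-check: the last
     parameter of for_wrapper is int, not extension. The wrapper has to go
     through a continuation (or through "%t") instead. *)
  assert (Format.kasprintf (fun str -> do_thing (Ext str)) for_wrapper 1 "a" = 3)
```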

I presume a solution would require implementing it at a lower level, maybe by adding to the actual format type some constructors for continuations, with yet another parameter. I am not sure if it is even possible. Of course the simpler solution is to tell people to always provide a continuation version when they write a library function that accepts formats.

By the way, while reading up on heterogeneous lists, I found this blog post which I think also helps understand the format type as it contains a toy implementation of it (since format is a heterogeneous list under the hood, sort of).
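
In the same spirit, here is a toy format type of my own (not the one from the blog post, and much simpler than the real CamlinternalFormatBasics definitions), with only literals, ints and strings. The names fmt, kprintf and sprintf are local to this sketch.

```ocaml
(* ('a, 'r) fmt: 'a is the function type consuming the arguments, 'r the
   final result, loosely like the first and last parameters of format4. *)
type ('a, 'r) fmt =
  | End : ('r, 'r) fmt
  | Lit : string * ('a, 'r) fmt -> ('a, 'r) fmt
  | Int : ('a, 'r) fmt -> (int -> 'a, 'r) fmt
  | Str : ('a, 'r) fmt -> (string -> 'a, 'r) fmt

(* Continuation-passing interpreter: each constructor either emits text or
   asks for one more argument, and k receives the finished string. *)
let rec kprintf : type a r. (string -> r) -> (a, r) fmt -> a =
 fun k -> function
  | End -> k ""
  | Lit (s, rest) -> kprintf (fun tail -> k (s ^ tail)) rest
  | Int rest -> fun n -> kprintf (fun tail -> k (string_of_int n ^ tail)) rest
  | Str rest -> fun s -> kprintf (fun tail -> k (s ^ tail)) rest

let sprintf fmt = kprintf Fun.id fmt

let () =
  (* Plays the role of the literal "x=%d, s=%s". *)
  let f = Lit ("x=", Int (Lit (", s=", Str End))) in
  assert (sprintf f 3 "hi" = "x=3, s=hi")
```

Here too the result type 'r is threaded through every constructor, which is why changing it after the fact means going through a continuation.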
