Format module from the Standard Library

Hello everyone. Do any of you know where I can find a decent tutorial for the OCaml Format module? I cannot figure out how to use it, and the only things I have found are the documentation and the official tutorial, which are detailed but not clear at all. I have spent the last 30 minutes trying to understand it, in vain.

Thanks a lot for your answers.

I would recommend reading this blog post: https://cedeela.fr/format-all-the-data-structures.html

For understanding boxes, see https://ocaml.org/learn/tutorials/format.html
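
For a quick feel of what boxes do, here is a small sketch of my own (not taken from the linked pages). It prints the same three words in the four main box kinds under a narrow margin; `render` and the printer names are made up for the illustration.

```ocaml
(* Render a printer into a string, with a chosen right margin. *)
let render ~margin pp =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  pp ppf;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* The same three words and two break hints ("@ "), in four kinds of box. *)
let h ppf = Format.fprintf ppf "@[<h>aaaa@ bbbb@ cccc@]"     (* never breaks *)
let v ppf = Format.fprintf ppf "@[<v>aaaa@ bbbb@ cccc@]"     (* always breaks *)
let hv ppf = Format.fprintf ppf "@[<hv>aaaa@ bbbb@ cccc@]"   (* all or nothing *)
let hov ppf = Format.fprintf ppf "@[<hov>aaaa@ bbbb@ cccc@]" (* only where needed *)

let () =
  List.iter
    (fun (name, pp) ->
      Printf.printf "%s:\n%s\n\n" name (render ~margin:12 pp))
    [ ("h", h); ("v", v); ("hv", hv); ("hov", hov) ]
```

With a margin of 12 the 14-character line does not fit, so `v` and `hv` should put each word on its own line, `hov` should break only before `cccc`, and `h` should stay on one line regardless. With a wide margin, `hv` goes back to a single line.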

I’ll have a look at it. Thanks!

If that’s an option for you, I’d suggest using easy-format. There’s a complete example here with various ways to format the same thing.

Check out this paper, whose purpose is to explain and document the module.

I vividly remember how strange it looked the first time I tried using it. Other people have recommended some resources for understanding how to use it and how the boxing model works. I strongly recommend using the fmt library, which abstracts away a lot of the complexity of the Format module.

I would actually recommend against easy-format. Fmt gives you a better API without adding layers upon layers of indirection, and it is directly compatible with all the printers out there.

Format has a terrible API, but you don’t have to use it directly nowadays: Fmt and CCFormat are 100% compatible with it, including format strings, and have excellent APIs.

I wonder what you use Format and descendants for?

I made easy-format around 2010 so I could pretty-print json and code in general, like a programmer would. Printing things like this using just Format was incredibly tricky:

{
  "field": {
    "number": 123,
    "text": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
  }
}

This requires a box with negative indentation for the } to be aligned with "field": { rather than with "text":. I don’t think it’s something the user should have to learn or understand, so it’s one of the styling options. The following style is also something that we don’t see often in popular json pretty-printers but it’s one of the built-in styles:

[
  [
    12345, 12345, 12345, 12345, 12345, 12345, 12345, 12345, 12345, 12345,
    12345, 12345, 12345, 12345, 12345, 12345, 12345
  ],
  [
    { "x": 0, "y": 100 },
    12345,
    12345,
    12345,
    12345,
    12345,
    12345,
    12345,
    12345,
    12345,
    12345,
    12345,
    12345
  ]
]

All the material within square brackets [...] in the example above uses the same built-in style. You can try this by running ydump on arbitrary json; it’s provided by the yojson package.

It’s great to see initiatives to offer libraries that work better in different contexts or overall.

I’m not sure what you mean by “layers upon layers of indirection”. Would you like to elaborate? (I haven’t used Fmt, which came later, and so I’m genuinely curious, not just sad because I’m the author of easy-format)

No, it doesn’t. Just close the box before the closing brace:

# Fmt.pr 
     {|@[<v2>"%s": {@ %a@]@ }|} 
     "foo" 
     Fmt.(list ~sep:cut string) [ "aaa"; "bbb" ; "ccc" ];;
"foo": {
  aaa
  bbb
  ccc
}

You almost never need negative indentation with Format. If you do, it’s more likely that your boxes are in the wrong places.
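
The same point can be checked with the standard library alone. A minimal sketch, assuming an explicit outer `v` box so that the break before the closing brace is always taken; `pp_block` is a name made up for this illustration:

```ocaml
(* Print  "name": { item ... }  with the items indented by two columns.
   The inner box is closed before the final break, so the closing brace
   is placed by the outer box, at column 0. No negative offset is needed. *)
let pp_block ppf (name, items) =
  Format.fprintf ppf "@[<v 0>@[<v 2>%S: {@ %a@]@ }@]"
    name
    (Format.pp_print_list ~pp_sep:Format.pp_print_cut Format.pp_print_string)
    items

let () =
  print_endline
    (Format.asprintf "%a" pp_block ("foo", [ "aaa"; "bbb"; "ccc" ]))
```

The items are indented relative to the inner box, while the closing brace belongs to the outer one.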

Honestly, there is nothing really easy about easy-format. The API is not particularly intuitive, the layout specification is not particularly easier to use, and it loses all the free-form combinators from Format.

Not a tutorial, but you might find it helpful to see some examples of Format usage.

Here’s a pretty-printer for a small subset of XML:

A pretty-printer for a subset of JavaScript:

A pretty-printer for a subset of Python:

Personally, I like the Format module very much and never had to reach for another library, but I remember that it was tricky to wrap my head around when I just started to use it.

Maybe these examples should be added to the manual of Format?

To me, the best tip for using Format is to avoid the functions and use the printf symbols instead. It just tends to work out much better that way.
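
To illustrate the tip, here is a small sketch that prints the same thing both ways: once with the explicit box functions and once with the equivalent format-string directives. The function names are made up for the example.

```ocaml
(* With functions: every box, break and string is a separate call. *)
let pp_pair_functions ppf (key, value) =
  Format.pp_open_hvbox ppf 2;
  Format.pp_print_string ppf key;
  Format.pp_print_string ppf " =";
  Format.pp_print_space ppf ();
  Format.pp_print_string ppf value;
  Format.pp_close_box ppf ()

(* With directives: "@[<hv 2>" opens the box, "@ " is the break hint,
   and "@]" closes the box. *)
let pp_pair_directives ppf (key, value) =
  Format.fprintf ppf "@[<hv 2>%s =@ %s@]" key value

let () =
  let a = Format.asprintf "%a" pp_pair_functions ("name", "value") in
  let b = Format.asprintf "%a" pp_pair_directives ("name", "value") in
  assert (a = b);
  print_endline b
```

Both printers have the same type, so they can be mixed freely with `%a`; the directive version just keeps the whole layout visible in one place.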

Well, the point here is to print "foo": on the left-hand-side independently from the right-hand-side, which may or may not involve curly braces.

Thanks for your input. I’m sure it’s not easy to get started with it but readability was the primary concern. For reference, people can look at the source code of the json pretty-printer that comes with yojson. I haven’t touched this code—or any proper pretty-printer for that matter—in 8 years and I still find this somewhat readable.

@mjambon regarding your JSON example, it is possible to format each key and value independently and without negative indentation. Here’s an example:

let fprintf, printf, list = Format.(fprintf, printf, pp_print_list)

let rec format_json f = function
  | `Number n ->
      fprintf f "%d" n
  | `Object pairs ->
      fprintf f "{@;<0 2>@[<v 0>%a@]@,}"
        (list ~pp_sep:(fun f () -> fprintf f ",@ ") format_key_value) pairs

and format_key_value f (key, value) =
  fprintf f "%s: %a" key format_json value

let example =
  `Object [
    "field", `Object [
      "number", `Number 123;
      "text", `Number 10000000000000000;
    ];
  ]

let () =
  printf "@[<v 0>%a@]" format_json example

Output:

{
  field: {
    number: 123,
    text: 10000000000000000
  }
}

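The trick is that the zero-indentation box is inside the curly braces: the break hint @;<0 2> right after { supplies the indentation, the box is opened at that column, and the final @, is handled by the enclosing box, which puts } back at the outer column.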

However, I would say, the solution is non-obvious.

My one problem with Format is that you can’t, for example, add a trailing comma to a list depending on whether the layout is horizontal or vertical. For example, I want this horizontal layout:

[foo, bar, baz]

Collapse into the following vertical layout:

[ 
  foo,
  bar,
  baz,
]

Notice the last trailing comma.

Suggestions welcome.

I remember thinking about adding a function pp_print_string_or_nl : Format.formatter -> t -> unit which would act more or less like pp_print_break and either print the given string, or split the line. I remember being unsatisfied with the API (and the fact that your use case doesn’t exactly fit proves that it was not good enough). If you come up with a good API, this should not be difficult to implement.

How about this:

val pp_print_explicit_break : formatter -> no_break:string -> break:(string * string) -> unit

Which is similar to pp_print_break, but takes strings instead of integers? The tuple corresponds to text before and after break.

pp_print_explicit_break ppf ~no_break:"" ~break:("", "    ") corresponds to pp_print_break ppf 0 4,
pp_print_explicit_break ppf ~no_break:" " ~break:("", "") corresponds to pp_print_break ppf 1 0,
while an optional trailing comma could be denoted as:

pp_print_explicit_break ppf ~no_break:"" ~break:(",", "")

Maybe it could have a format string shortcut, like pp_print_break does. For example @;<"" "" "    "> for @;<0 4>.

Not sure. This does not strike me as particularly elegant.

That certainly seems sufficiently generic, and I think it’s the best you can provide given Format's layout engine. It’s indeed not fantastically elegant, but not all functions are beautiful programming pearls. :)

I’m not sure a syntax is really needed.

If you want to take a shot at it, I’m fairly certain this would be easy to add (just by looking at break and generalizing it). You can then express the other break/cut functions in terms of it.

@keleshev I definitely forgot, but the original conclusion that a negative indentation was needed may have come from the following.

For json objects, we should have a style that outputs either this (a):

{
  field: {number: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", text: "a"}
}

or this (b):

{
  field: {
    number: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    text: "aaaaaaaaaaaaaaaaaa"
  }
}

but not this (c):

{
  field: {
    number: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", text: "a"
  }
}

These outputs were produced by modifying your original code, using an hv box for the list of fields instead of just v, and changing the leaf values to strings of different lengths. Editing the aaaa strings for case (c) gives:

let fprintf, printf, list = Format.(fprintf, printf, pp_print_list)

let rec format_json f = function
  | `String s ->
      fprintf f "%S" s
  | `Object pairs ->
      fprintf f "{@;<0 2>@[<hv 0>%a@]@,}"
        (list ~pp_sep:(fun f () -> fprintf f ",@ ") format_key_value) pairs

and format_key_value f (key, value) =
  fprintf f "%s: %a" key format_json value

let example =
  `Object [
    "field", `Object [
      "number", `String "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
      "text", `String "a";
    ];
  ]

let () =
  printf "@[<v 0>%a@]" format_json example

FYI, there’s now a PR for the “pp_print_explicit_break” feature: https://github.com/ocaml/ocaml/pull/2002

Feedback is welcome.
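
For reference, a sketch of the trailing-comma layout asked about earlier in this thread. It assumes OCaml 4.08 or later, where the standard library provides `Format.pp_print_custom_break ~fits ~breaks`; each argument is a `(before, spaces, after)` triple, which differs slightly from the signature proposed above. `pp_items` and `render` are names invented for the example.

```ocaml
(* Hypothetical helper: print a list that gets a trailing comma only when
   it is broken over several lines. Requires OCaml >= 4.08. *)
let pp_items ppf items =
  let pp_sep ppf () = Format.fprintf ppf ",@;<1 2>" in
  (* A single hv box: either every break below is taken or none is. *)
  Format.pp_open_hvbox ppf 0;
  Format.pp_print_string ppf "[";
  Format.pp_print_break ppf 0 2;
  Format.pp_print_list ~pp_sep Format.pp_print_string ppf items;
  (* One line: print nothing. Broken: print "," and then start a new line. *)
  Format.pp_print_custom_break ppf ~fits:("", 0, "") ~breaks:(",", 0, "");
  Format.pp_print_string ppf "]";
  Format.pp_close_box ppf ()

(* Render to a string with a chosen margin so both layouts can be seen. *)
let render ~margin items =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  pp_items ppf items;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () =
  print_endline (render ~margin:80 [ "foo"; "bar"; "baz" ]);
  print_endline (render ~margin:12 [ "foo"; "bar"; "baz" ])
```

With the wide margin this should print `[foo, bar, baz]`; with the narrow one, each item goes on its own line followed by a comma, including the last. The items carry their indentation as break offsets instead of living in a nested box, so that one `hv` box decides all the breaks at once.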