[ANN] ppx_format

I am happy to announce the first release of ppx_format.

It's a small ppx rewriter, first written at the Mirage retreat in 24 with @PizieDust, that lets you put values in the middle of format strings:

let s = "World"
let x = 123
let () = Format.printf {%i|Hello {%s s} {%a Format.pp_print_char % Char.chr 65} {%d x}%!|}

It's compatible with any function that takes format strings. The only constraint is that the format string has to be the last argument.
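For readers who want to see the example without the ppx, here is a hand-written equivalent using only the standard library. It is a sketch under an assumption: that each `{%spec expr}` hole turns into the conversion `%spec` in the format string plus `expr` as a trailing argument, and that the `%` inside the `%a` hole separates the printer from the value. The actual expansion produced by ppx_format may differ. `Format.asprintf` is used instead of `Format.printf` only so the result can be inspected as a string.

```ocaml
let s = "World"
let x = 123

(* Assumed hand-written equivalent of the interpolated example:
   every hole becomes a conversion, and its expression moves to the
   argument list in the same order. *)
let greeting =
  Format.asprintf "Hello %s %a %d" s Format.pp_print_char (Char.chr 65) x

(* Prints: Hello World A 123 *)
let () = Format.printf "%s@." greeting
```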

I have used it in some of my projects, and it will be available on opam as soon as the release PR is merged.


Would it be possible to make the syntax like this: Format.printf {|Hello %s{s}|}?


Fun coincidence: at the last Mirage retreat I discussed format interpolation with @octachron, and I wrote the beginning of an RFC for it (and then promptly forgot about it). I just pushed it at …/format-interpolation.md.


You might consider a different interpolation syntax (viz. camlp5/pa_ppx_fmtformat on GitHub, a PPX rewriter that provides string interpolation using Fmt as the underlying mechanism).

The advantage of the suggested format for interpolated expressions over “%{…%}” is that with the multiple forms, there’s never a need to escape – you can just pick a different form.

====

The simplest interpolated expression is of the form $(...) but all of the following are accepted:

  • $(...), $(|...|)
  • $[...], $[|...|]
  • ${...}, ${|...|}
  • $<...>, $<|...|>

So basically, ‘$’ followed by any of [ ‘(’, ‘[’, ‘{’, ‘<’ ],
optionally ‘|’, and then at the end, the matching text. Between these
8 forms, it should be possible to enclose any interpolated expression
without difficulty, I would think.

In the text surrounded by these delimiters, anything other than the
end-string is acceptable, and there is no provision made for escaping.

The contents of the interpolated expression can be of three forms:

==== interpolated expression with format-specifier: $( <expression> | <format-specifier> )

An interpolated expression of the form $(abc|%d) specifies that the
expression abc will be formatted with %d. So {%fmt_str|a $(abc|%d)|} expands to
Fmt.(str "a %d" abc).

==== interpolated expression with Fmt formatter: $( <expression> | <Fmt formatter expression> )

An interpolated expression of the form $(abc|int) specifies that the
expression abc will be formatted with the Fmt formatter int. So {%fmt_str|a $(abc|int)|} expands to
Fmt.(str "a %a" int abc).

==== interpolated expression without specifier/formatter: $( <expression> )

An interpolated expression of the form $(abc) specifies that the
expression abc will be formatted with %s. So {%fmt_str|a $(abc)|} expands to
Fmt.(str "a %s" abc).
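The three expansions quoted above can be tried without installing anything. The sketch below uses the standard library as a stand-in for Fmt, on the assumption that `Fmt.str` behaves like `Format.asprintf` and `Fmt.int` like `Format.pp_print_int`. The variable names `abc` and `name` are only for illustration.

```ocaml
let abc = 42
let name = "abc"

(* $(abc|%d): expression with a format specifier *)
let with_spec = Format.asprintf "a %d" abc

(* $(abc|int): expression with a formatter, passed through %a *)
let with_formatter = Format.asprintf "a %a" Format.pp_print_int abc

(* $(name): no specifier, so the expression is formatted with %s
   and therefore has to be a string *)
let plain = Format.asprintf "a %s" name
```

The first two forms produce the same string here; the difference is that the `%a` form accepts any pretty-printer, not just the built-in conversions.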


Implementation-wise it should be pretty easy. However, I use it quite a bit in other projects, so I would prefer not to change the syntax. If you really want this, I think it should be easy to maintain a fork.

Interesting indeed. On the implementation side, my version basically has a second lexer.

I don't have provision for escaping either. My idea is that if you want something like `{%` in your expression, you can always bind that expression to a variable and use the variable. I don't think it's very nice to have very complex expressions inside the format string.

In my opinion, using {% in the format string is more of an issue than using it in the expressions, and it's not solved by your proposed syntax.

Also, the idea of having a special syntax for %a that does not have %a in it might be good, but my goal with this ppx was to provide maximum familiarity: someone used to printf/format can read code written with my ppx and immediately guess correctly what is happening.

Emile,

First, your rationale is excellent, and I understand your reasoning. I cannot fault it. I would add, though, that when one can start using the equivalent of [from Perl5]

"abc $foo def $bar"

in OCaml, you start using more and more of the stuff, assembling more and more complicated strings containing interpolated variables. And then you get to where you want an interpolated -expression- …

"abc ${\( f($foo) )} def $bar"

and you’re off to the races. That idiom ${\( …)} is a “scalar context” within a string – you can put any scalar expression in there. And at that point, you can write rather complex structural expressions, that are all one big printf.

It was when I realized I could do this in Perl5 back in 1995, that I became a Perl bigot.

All of this is a way of trying to express that while surely Printf-style format-strings do not encourage complexity and depth in the expressions one writes, when you can start doing interpolation, and esp. interpolation with -expressions-, at least some people will find it irresistible.

And I do think that it’s more readable to do things this way, than to use loops/iterators and such to write the same thing without interpolation.

All that said, for sure I understand your position, and given your design goal, I can’t dispute your decision.


I think that's probably quite good, but in OCaml you get degraded tooling inside the format string: merlin is spotty, there is no ocamlformat, no syntax highlighting. So in my opinion it's better to outsource complex logic to a variable above. If such expressions-in-strings were a first-class citizen of the language, like I am guessing they are in Perl, it would be quite different.
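To make the "bind it to a variable above" style concrete, here is a small sketch in plain OCaml. The names `items`, `joined` and `line` are made up for the example, and it uses `Format.asprintf` directly rather than the ppx so that it runs without any dependency.

```ocaml
let items = [ "x"; "y"; "z" ]

(* The non-trivial logic lives in an ordinary binding, where merlin,
   ocamlformat and syntax highlighting all work as usual. *)
let joined = String.concat ", " items

(* The format string only ever mentions simple variables and calls.
   With the ppx, the first hole would presumably be written {%s joined}. *)
let line = Format.asprintf "items: %s (%d total)" joined (List.length items)
```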