A `fmt`-like library of combinators for printf/sprintf's `%t`?

I’m a giant fan of @dbuenzli’s fmt, and use it all over the place. But I’m aware that there’s a downside to it – it rests on the Format module, and hence incurs some computation even if one doesn’t use pretty-printing ops, and at least some of its combinators do use such ops.

So I’ve been thinking about trying to use printf and sprintf, and before I get too far down this road, I figured I’d ask if anybody knew of such a thing. If not, I guess I’ll be writing one (haha, that is to say, shamelessly copying from @dbuenzli).

Presumably you are worrying about logging?

Note that there is a scheme to avoid this computation (definition). This is implemented in logs and for example here for a smaller self-contained implementation showing that it is easy to roll your own if you need to. See here for the discussion where @yallop introduced the trick and @drup profiled it.
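For readers who haven't seen the trick, here is a minimal sketch of the idea as I understand it: the log call takes a closure, and that closure is only run when logging is enabled, so neither the formatting nor the arguments get computed otherwise. The `msgf` type below is a simplified version of the one in logs (the real one carries extra optional arguments), and `enabled`, `buf`, `log`, and `expensive` are names made up for this example.

```ocaml
(* Simplified message-function type: the caller is handed a formatting
   function [m] and only formats if [m] is actually called. *)
type ('a, 'b) msgf =
  (('a, Format.formatter, unit, 'b) format4 -> 'a) -> 'b

(* Hypothetical global switch and output buffer, for illustration only. *)
let enabled = ref false
let buf = Buffer.create 64
let ppf = Format.formatter_of_buffer buf

(* When logging is disabled, [msgf] is never invoked, so nothing in its
   body (format string or arguments) is evaluated. *)
let log (msgf : ('a, unit) msgf) : unit =
  if !enabled then
    msgf (fun fmt ->
        Format.kfprintf (fun ppf -> Format.pp_print_flush ppf ()) ppf fmt)

(* A stand-in for an argument that is costly to compute. *)
let calls = ref 0
let expensive () = incr calls; "payload"

(* Disabled: the closure is skipped and [expensive] is not called. *)
let () = log (fun m -> m "value: %s" (expensive ()))

(* Enabled: the closure runs and the message lands in [buf]. *)
let () = enabled := true
let () = log (fun m -> m "value: %s" (expensive ()))
```

After the two calls, `expensive` has run exactly once, which is the whole point of wrapping the message in a closure.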

Actually, no, I’m not worried about logging. I’ve been using fmt as a replacement for printf and (esp.) sprintf pretty much everywhere, b/c … well, you’ll have to rip those combinators outta my cold dead hands, maaaaan.

So for instance, I’m currently coding up a library to process Latex (inspired by alvinwan/TexSoup, a fault-tolerant Python3 package for searching, navigating, and modifying LaTeX documents), and this will involve doing a lot of printing – I parse the Latex into lexemes and then parse trees[1], so I need to print them back out, and I’m using Fmt as the core structure for that. A critical requirement is that parsing and then printing back out is the identity function.

I have one concern, and one actual problem:

(a) concern: there’s all that processing happening in Format to … -format- the output, right? If there’s no boxes, no pp hints, then that processing is for naught: wouldn’t it be nice to avoid it?

(b) problem: I noticed that as I was printing something back out, a newline got inserted. I tracked it back to a combinator that inserted a break, and that got me to thinking: wouldn’t it be nice if there were no pp hints at all, so what got output was exactly what was specified in the printfs?
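To make (b) concrete, here is a small sketch using only the standard library (not the actual combinator that bit me): a `@ ` break hint in a Format box turns into a newline once the line would exceed the margin, whereas Printf emits the space exactly as written. The two 60-character strings are arbitrary, chosen only so that together they overflow Format's default margin.

```ocaml
(* Two strings that cannot both fit on one line at the default margin. *)
let a = String.make 60 'x'
let b = String.make 60 'y'

(* The "@ " break hint is rendered as a newline here, because [b] no
   longer fits on the current line. *)
let via_format = Format.asprintf "@[%s@ %s@]" a b

(* Printf has no notion of margins or breaks: the space stays a space. *)
let via_printf = Printf.sprintf "%s %s" a b
```

So a round-trip printer built on break hints can change the text depending on line length, which is exactly what an identity-preserving printer must not do.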

[1] I’m aware that Latex barely has a lexical structure, and really it doesn’t have a syntactic structure. But for any particular set of commands/environments, you can treat them as having grammatical structure, and all the rest as just raw strings, and that seems to allow some processing. Mostly (it seems) what you need to do is ensure that documents are “{}” and “” balanced, which I think is doable (or at least, is the case for the documents I’m trying to process).

P.S. All this started with two problems in the Latex document files:

(a) look for certain classes of environments and check whether they contain math (e.g. ‘$’)
(a’) then look for certain other classes and check whether they contain other chars (‘%’, ‘^’)
(b) look for certain commands “\of” and check whether arguments contain math.

#a can be done with some clever perl (and non-greedy regexps). But #b is hard to do without at least doing parsing up to the point of parenthesis-matching, hence the impetus for this little toy project.

Eventually I’m going to want to perform rewrites on the tree, so I don’t just want to parse, but also to print back out with as little change as possible.

Well that looks neither a task for Format nor for Printf.

I’ll admit that the biggest reason why I’m thinking of Printf is its resemblance to Fmt, and of course Fmt’s draw was the combinators. So once I start going down the road of “implement combinators for printf/sprintf”, it’s a small step to “implement combinators for bprintf”, and then to just “emit code that’ll call output_string directly on out_channels, or Buffer.add_string directly on a buffer”.
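Roughly what I have in mind, as a sketch only: a printer is a function `Buffer.t -> 'a -> unit`, which is the shape that `Printf.bprintf` expects for `%a`, so combinators and format strings compose. All the names below (`t`, `const`, `list`, `pair`, `braced`, `str`) are my own, loosely modeled on Fmt, and are not an existing library.

```ocaml
(* A printer writes a value of type ['a] straight into a buffer. *)
type 'a t = Buffer.t -> 'a -> unit

let string : string t = Buffer.add_string
let int : int t = fun b i -> Buffer.add_string b (string_of_int i)

(* Print a fixed value, ignoring the unit argument (in the spirit of Fmt.const). *)
let const (pp : 'a t) (v : 'a) : unit t = fun b () -> pp b v

(* Print list elements with an optional separator between them. *)
let list ?(sep = (fun _ () -> ())) (pp : 'a t) : 'a list t =
 fun b xs -> List.iteri (fun i x -> if i > 0 then sep b (); pp b x) xs

(* Print both components of a pair with an optional separator. *)
let pair ?(sep = (fun _ () -> ())) (pa : 'a t) (pb : 'b t) : ('a * 'b) t =
 fun b (x, y) -> pa b x; sep b (); pb b y

(* Combinators can themselves be written with format strings and %a. *)
let braced (pp : 'a t) : 'a t = fun b v -> Printf.bprintf b "{%a}" pp v

(* sprintf-like entry point built on a fresh buffer. *)
let str fmt =
  let b = Buffer.create 64 in
  Printf.kbprintf Buffer.contents b fmt

let s1 = str "[%a]" (list ~sep:(const string ", ") int) [1; 2; 3]
let s2 = str "\\of%a" (braced string) "x+y"
let s3 = str "%a" (pair ~sep:(const string "=") string int) ("n", 4)
```

Here `s1` is `[1, 2, 3]`, `s2` is `\of{x+y}`, and `s3` is `n=4`. Nothing in this pipeline knows about boxes or breaks, so the output is exactly what the format strings and printers say.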

I suspect I’ll get to that endpoint, but for now, format-strings are a useful structuring mechanism.

ETA: upon actually implementing, I realized that I didn’t want to use %t, but rather %a, just as in Fmt.
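A tiny illustration of the difference, using `Printf.sprintf` from the standard library (the printer `pp_int` is just an example name): with `%a` the printer and the value are separate arguments, so a printer can be defined once and reused, while with `%t` the value has to be captured in a closure at every call site.

```ocaml
(* For sprintf, a %a printer has type [unit -> 'a -> string]. *)
let pp_int () (i : int) : string = string_of_int i

(* %a: pass the reusable printer, then the value. *)
let with_a = Printf.sprintf "n=%a" pp_int 42

(* %t: the argument has type [unit -> string], so the value must be
   baked into the closure. *)
let with_t = Printf.sprintf "n=%t" (fun () -> string_of_int 42)
```

Both produce `n=42`, but only the `%a` form lets combinators take printers as arguments the way Fmt's do.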