[This is a post about two different-but-related things, so a little
schizophrenic perhaps, and somewhat stream-of-consciousness too.]

It’s about a templating idea (for Printf/Fmt-like formatting), and
then how to add internationalization on top of that.

I’ve been writing code that does a lot of formatted text
(specifically, in the context of better error-messages), and thinking
about how that could be made easier to do. I use Fmt a lot, because
even when not using Format operations, it’s very compact and the
combinator-based approach is easy on the brain. But it seems
like (with a bit of a front-end) it could be better. I was thinking
about the following code:
    Fmt.(str "File %s: mixed short/medium/long-form attributes in str_item:\n short: %a\n medium: %a\n long: %a"
           filename
           (list string) used_short_form_attributes
           (list string) used_medium_form_attributes
           (list string) used_long_form_attributes
    )
and how it separates the format-specifier (%a), the formatter
((list string)) and the actual value being formatted
(used_short_form_attributes). And I wondered: maybe it might be nicer
if it were
{%template|[id: mixed-attributes-in-str-item]
File $filename$:$%d:line$: mixed short/medium/long-form attributes in str_item:
short: $list string:used_short_form_attributes$
medium: $list string:used_medium_form_attributes$
long: $list string:used_long_form_attributes$|}
The idea being, you specify each bit of ocaml value that’s going to be
formatted between $ characters:
- If it’s a string, that’s it: just the expression (implied %s as
  the format-specifier).
- If it’s anything other than a value that is going to be processed by
  a formatter, you give the format-specifier and the expression,
  e.g. $%d:line$.
- And if it’s going to be processed by a formatter, %a is implicit,
  so you just provide the formatter and the expression,
  e.g. $list string:used_short_form_attributes$.
[I hope it’s obvious this is easy to parse. Obviously $ is a
special char, as is the colon.]
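
To check that the three rules above really are easy to parse, here is a minimal sketch using only the OCaml standard library. Nothing in it comes from an existing package: the type names (hole, segment) and the functions (classify, parse, format_string) are made up for illustration, and it assumes that $ and the first colon inside a hole are never part of the OCaml expression itself.

```ocaml
(* Hypothetical types for illustration; not from any existing library. *)
type hole =
  | Str of string            (* $expr$            : implied %s *)
  | Spec of string * string  (* $%d:expr$         : explicit format-specifier *)
  | Pp of string * string    (* $formatter:expr$  : implied %a *)

type segment = Text of string | Hole of hole

(* Classify the text found between two '$' by splitting at the first ':'. *)
let classify (body : string) : hole =
  match String.index_opt body ':' with
  | None -> Str body
  | Some i ->
    let left = String.sub body 0 i in
    let right = String.sub body (i + 1) (String.length body - i - 1) in
    if String.length left > 0 && left.[0] = '%' then Spec (left, right)
    else Pp (left, right)

(* Split on '$': pieces at even positions are literal text, pieces at odd
   positions are holes. Unbalanced '$' is not detected in this sketch. *)
let parse (template : string) : segment list =
  String.split_on_char '$' template
  |> List.mapi (fun i piece ->
         if i mod 2 = 0 then Text piece else Hole (classify piece))
  |> List.filter (fun seg -> seg <> Text "")

(* Rebuild the Printf/Fmt-style format string a rewriter could emit.
   Literal '%' in the text is escaped as "%%". *)
let format_string (segs : segment list) : string =
  let escape t = String.concat "%%" (String.split_on_char '%' t) in
  segs
  |> List.map (function
       | Text t -> escape t
       | Hole (Str _) -> "%s"
       | Hole (Spec (spec, _)) -> spec
       | Hole (Pp _) -> "%a")
  |> String.concat ""
```

A real PPX would additionally have to parse each expression and formatter string into OCaml AST and pass them as arguments after the format string; this sketch stops at the string level.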
I haven’t written this, and heck, I don’t even know if it’s a good and
scalable idea. I searched the opam database and google, and found
nothing for this sort of application: there are templating languages
for HTML and such, but those are for much larger bits of text.
That’s the first idea.
And then, thinking about it, I realized that perhaps one could go
further and use the fact that a PPX rewriter is processing the
template, to hook into ocaml-gettext? I’ve never used it, but plan to
do so today to figure out how it works; if that doesn’t work, then
something like ocaml-gettext. That is to say, each template could have
an ID (as above, which isn’t actually printed) that gets used to index
into a PO file (or a PO-like file), e.g.
mixed-attributes-in-str-item:
en_US: """
File $#1$:$#2$: mixed short/medium/long-form attributes in str_item:
short: $#3$
medium: $#4$
long: $#5$
"""
fr_FR: """
File $#1$:$#2$: attributs mixtes de forme courte/moyenne/longue dans str_item:
longue: $list frenchquoted_string:#5$
moyenne: $list frenchquoted_string:#4$
courte: $list frenchquoted_string:#3$
"""
[please forgive my bad French, I used Google Translate]
(where frenchquoted_string is a Fmt formatter that surrounds its
argument with guillemets (“<<” and “>>”)). And then, the PPX rewriter
would take this PO file as an argument:
- any message-id that didn’t appear in the file would get added
- any message-id whose template text was different from the message
  in en_US would get updated (only the en_US message in the PO file)
- the PPX rewrite process would be given a language (e.g. fr_FR)
  and use that to select the message-texts from the PO file that it
  would use in the template. So if you selected fr_FR, the text
  in mixed-attributes-in-str-item would be replaced with the
  French text.
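
The three rules above could be sketched like this, with an in-memory map standing in for the PO-like file. Everything here is hypothetical (the catalog type, register, select), and falling back to the source text when a language has no entry is my own assumption, not something decided above.

```ocaml
module SMap = Map.Make (String)

(* message-id -> (locale -> message text); a hypothetical stand-in for
   the PO-like file. *)
type catalog = string SMap.t SMap.t

(* Rules 1 and 2: add the message-id if it is missing, and make the en_US
   entry match the template text in the source. Other locales are left
   untouched. *)
let register ~id ~source_text (cat : catalog) : catalog =
  let entry = Option.value (SMap.find_opt id cat) ~default:SMap.empty in
  SMap.add id (SMap.add "en_US" source_text entry) cat

(* Rule 3: pick the text for the requested language. Assumption: fall
   back to the source text when there is no translation. *)
let select ~lang ~id ~source_text (cat : catalog) : string =
  match SMap.find_opt id cat with
  | None -> source_text
  | Some entry ->
    (match SMap.find_opt lang entry with
     | Some text -> text
     | None -> source_text)
```

Reading and writing the actual file format is left out; that is exactly the part where hooking into ocaml-gettext might or might not fit.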
And in that replacement template, the order and formatters for each
expression could be changed:
- #N refers to the Nth expression in the original template (e.g. #2
  would refer to the second expression), so you could reorder
- $#N$ means to use the same formatting instructions as in the
  original template
- $%d:#N$ means to substitute %d as the formatting instruction
  (perhaps for different justification)
- $xxyy zz:#N$ means to use a different formatter
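
As a sketch of how a rewriter might resolve those references, assuming the holes of the original template have already been collected in order: the types and the resolve function below are hypothetical names, and indices are 1-based as in #1, #2, and so on.

```ocaml
(* Holes of the original template (hypothetical representation). *)
type hole =
  | Str of string            (* $expr$           : implied %s *)
  | Spec of string * string  (* $%d:expr$        : explicit format-specifier *)
  | Pp of string * string    (* $formatter:expr$ : implied %a *)

(* A hole in a translated template: which original expression it refers
   to, and whether the formatting instruction is overridden. *)
type override =
  | Same                (* $#N$           *)
  | Use_spec of string  (* $%d:#N$        *)
  | Use_pp of string    (* $formatter:#N$ *)

type ref_hole = { index : int; how : override }

let expr_of = function Str e | Spec (_, e) | Pp (_, e) -> e

(* Look up the Nth original hole (1-based) and apply the override.
   Returns None when N is out of range, which a real rewriter would
   presumably report as a build-time error. *)
let resolve (originals : hole list) (r : ref_hole) : hole option =
  if r.index < 1 then None
  else
    match List.nth_opt originals (r.index - 1) with
    | None -> None
    | Some h ->
      Some
        (match r.how with
         | Same -> h
         | Use_spec spec -> Spec (spec, expr_of h)
         | Use_pp fmt -> Pp (fmt, expr_of h))
```

Mapping resolve over the holes of the fr_FR template, in the order they appear there, would give the reordered argument list for the generated formatting call.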
This pretty much requires that you do your internationalization at
build-time, since you’re replacing code, reordering arguments, etc.
I’m not sure if that’s a good idea or not, but it seems appealing. I
wrote earlier that I wondered if I could hook into ocaml-gettext, and
maybe this ability to reorder and change code/formatting breaks that,
but maybe it’d be worth limiting that, in order to be able to use
ocaml-gettext.
Anyway, OK, that’s the half-baked brain fart. I would really
appreciate any comments on this.
ETA: I see that ppx_pyformat comes close to what I want to do. I’ll
have to look closely at it.