Hygienic macros for OCaml

Chet_Murthy · September 23, 2022, 5:59pm

I’ve been thinking about how to cleanly implement some form of hygienic macros for OCaml. Specifically, I’m thinking about how to implement something like Rust’s macro_rules. The problem there is that the objects the macros manipulate are “token-trees”: streams of tokens, structured into trees at “()”, “[]”, “{}”. Now, it would make sense to stick to syntax that fits with OCaml’s PPX, so one would want some form of PPX extension that specifies that the payload is a token-tree stream. Today, the payloads can be: structures, signatures, patterns, types. Strings/expressions enter as structures.

None of these work for hygienic macros implemented in the style of Rust: they require that the result of macro-expansion be re-parsed, and it can easily be the case that the syntactic category of a token-tree argument to a macro (is “a * b” a type, or an expression?) cannot be inferred until that final parse. So it would seem that one would need a new kind of payload for PPX extensions – a token-tree. Viz.

[%foo! (a, b,c)]

where the “!” indicates that the payload is a token-tree stream bounded by that closing “]”.
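To make the token-tree idea concrete, here is a minimal sketch in Rust, where `macro_rules!` already works this way. The names `count_tts`, `both_ways`, and `Unit` are made up for illustration. The first macro shows that a delimited group counts as a single token tree. The second splices the same token tree once where a type is expected and once where an expression is expected, so its syntactic category is only settled when the expansion is parsed.

```rust
// Count the token trees in the argument, peeling off one `tt` at a time.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// A unit struct: the name `Unit` is both a type and a value.
#[derive(Debug, PartialEq)]
struct Unit;

// Use one token tree twice: first in type position, then in expression
// position. Nothing about the argument itself says which category it is.
macro_rules! both_ways {
    ($tt:tt) => {{
        let value: $tt = $tt;
        value
    }};
}

fn main() {
    // A parenthesised group is one tree, whatever it contains.
    assert_eq!(count_tts!((a, b, c)), 1);
    // Without delimiters, `a`, `*` and `b` are three separate trees.
    assert_eq!(count_tts!(a * b), 3);
    // Each of the three bracket kinds forms its own tree.
    assert_eq!(count_tts!((a) [b] {c}), 3);

    // `(Unit, Unit)` is read as a tuple type and as a tuple expression.
    let pair = both_ways!((Unit, Unit));
    assert_eq!(pair, (Unit, Unit));
}
```

A token-tree payload for PPX would presumably need the same property: the extension receives balanced, delimiter-structured tokens and leaves classification to the final parse.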

One might ask “why not use a string payload” as in

{%foo|(a,b,c)|} (which is shorthand, if I’m remembering the syntax right, for [%foo {|(a,b,c)|}])

and the answer is that in nested macro-invocations, we would want to rename any hygienically-created identifiers, but that will be difficult if they are embedded in string payloads. It would involve repeated parsing/renaming/stringification. A big mess, and hard to keep location information consistent (for debugging).

In Camlp5, I can sidestep all of this, b/c I can extend the syntax in arbitrary ways, but obviously if hygienic macros are going to be useful, they’ll need to be implemented with the current PPX infrastructure; hence my question to you all.

I’d welcome any feedback.

yallop · September 23, 2022, 8:46pm

You might be interested to know that we’ve recently restarted work on the modular macros work in Cambridge, with the aim of completing and formalising the design and bringing the implementation to an upstreamable state.

Modular macros are computations that hygienically construct code from typed fragments and integrate smoothly with other OCaml features such as modules. The details are rather different from Rust’s macros, but perhaps they’ll support the use cases you’re interested in. There are some examples in the extended abstract, such as the typed printf function that builds a printer from a format description:

macro rec printk : type a b. (string expr -> b expr) -> (a, b) fmt -> a expr =
  fun k -> function
  | Int -> << fun s -> $(k <<string_of_int s>>) >>
  | Lit s -> k (lift_string s)
  | Cat (l, r) -> printk (fun x ->
                  printk (fun y -> k << $x ^ $y >>) r) l

and some larger examples in @otini’s macros-examples repository, such as a port of the strymonas stream fusion library.

Chet_Murthy · September 24, 2022, 12:15am

Jeremy,

I think it’s worth looking at the Rust model for macro-writing, independently of hygiene (which is important – just, it’s not all that’s interesting about Rust’s model). I attach a function that one uses to format/write out a complex number. Here’s a use: write_complex!(f, "", "", self.re, self.im, T) (where f is a formatter object).

This is a lot of code, and complex code, and yet, you basically write it as if you’re just writing plain Rust. And this model works for a remarkable number of different kinds of macros. That’s powerful, and a model that I think might be valuable for OCaml.

Also, for the example you adduce (implementing printf), the way Rust does it is to combine simple macros and modular-implicits. Again, remarkably powerful.

macro_rules! write_complex {
    ($f:ident, $t:expr, $prefix:expr, $re:expr, $im:expr, $T:ident) => {{
        let abs_re = if $re < Zero::zero() {
            $T::zero() - $re.clone()
        } else {
            $re.clone()
        };
        let abs_im = if $im < Zero::zero() {
            $T::zero() - $im.clone()
        } else {
            $im.clone()
        };

        return if let Some(prec) = $f.precision() {
            fmt_re_im(
                $f,
                $re < $T::zero(),
                $im < $T::zero(),
                format_args!(concat!("{:.1$", $t, "}"), abs_re, prec),
                format_args!(concat!("{:.1$", $t, "}"), abs_im, prec),
            )
        } else {
            fmt_re_im(
                $f,
                $re < $T::zero(),
                $im < $T::zero(),
                format_args!(concat!("{:", $t, "}"), abs_re),
                format_args!(concat!("{:", $t, "}"), abs_im),
            )
        };

        fn fmt_re_im(
            f: &mut fmt::Formatter<'_>,
            re_neg: bool,
            im_neg: bool,
            real: fmt::Arguments<'_>,
            imag: fmt::Arguments<'_>,
        ) -> fmt::Result {
            let prefix = if f.alternate() { $prefix } else { "" };
            let sign = if re_neg {
                "-"
            } else if f.sign_plus() {
                "+"
            } else {
                ""
            };

            if im_neg {
                fmt_complex(
                    f,
                    format_args!(
                        "{}{pre}{re}-{pre}{im}i",
                        sign,
                        re = real,
                        im = imag,
                        pre = prefix
                    ),
                )
            } else {
                fmt_complex(
                    f,
                    format_args!(
                        "{}{pre}{re}+{pre}{im}i",
                        sign,
                        re = real,
                        im = imag,
                        pre = prefix
                    ),
                )
            }
        }

        #[cfg(feature = "std")]
        // Currently, we can only apply width using an intermediate `String` (and thus `std`)
        fn fmt_complex(f: &mut fmt::Formatter<'_>, complex: fmt::Arguments<'_>) -> fmt::Result {
            use std::string::ToString;
            if let Some(width) = f.width() {
                write!(f, "{0: >1$}", complex.to_string(), width)
            } else {
                write!(f, "{}", complex)
            }
        }

        #[cfg(not(feature = "std"))]
        fn fmt_complex(f: &mut fmt::Formatter<'_>, complex: fmt::Arguments<'_>) -> fmt::Result {
            write!(f, "{}", complex)
        }
    }};
}

ETA: in this case, the macro is pretty simple and doesn’t do any pattern-matching. But lots of macros do pattern-matching and deal with repeated arguments and such. It’s a lot like syntax-rules in Scheme (I think that’s the one – might be one of the other ones that has “…” for repeated arguments).
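For comparison, here is a small sketch of a `macro_rules!` macro that does pattern-match and handle repeated arguments. The macro name `max_of` is made up for illustration. The `$( ... ),+` form plays roughly the role that “…” plays in Scheme’s syntax-rules, and the two arms are tried in order, like clauses of a match.

```rust
// Largest of one or more comma-separated expressions.
macro_rules! max_of {
    // Base case: a single expression is its own maximum.
    ($x:expr) => { $x };
    // Recursive case: compare the first expression with the maximum of the rest.
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    assert_eq!(max_of!(7), 7);
    assert_eq!(max_of!(3, 9, 4), 9);
    // Arguments are ordinary expressions, evaluated once each.
    assert_eq!(max_of!(1 + 1, 2 * 3, 5 - 4), 6);
}
```

The temporaries `a` and `b` are reintroduced at every level of the recursion without clashing, which is the same hygiene property the thread is about.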

egoholic · September 26, 2022, 7:27am

Sorry for the off-topic question, but what does “hygienic” mean in the OCaml/FP/language-design context?

Chet_Murthy · September 26, 2022, 7:43am

It refers to the same idea from Scheme: Hygienic macro - Wikipedia

In short, if the expansion of a macro introduces bound variables, then those bound variables must not inadvertently capture free variables. A typical way of achieving this is that bound variables introduced in the macro’s expansion are always chosen to be fresh via some gensym-like method.
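A minimal sketch of the capture problem, again in Rust because `macro_rules!` is hygienic for local variables. The macro name `my_or` is made up for illustration. The expansion binds a temporary called `t`, and the caller happens to pass its own variable `t` as an argument.

```rust
// Short-circuiting "or" that evaluates its first argument only once.
macro_rules! my_or {
    ($a:expr, $b:expr) => {{
        // `t` is introduced by the macro, not by the caller.
        let t = $a;
        if t { t } else { $b }
    }};
}

fn main() {
    let t = true;
    // Purely textual substitution would produce
    //     { let t = false; if t { t } else { t } }
    // where the caller's `t` is captured by the macro's binding, giving false.
    // Under hygienic expansion the two `t`s stay distinct, so `$b` still
    // refers to the caller's variable.
    let result = my_or!(false, t);
    assert!(result);
}
```

An unhygienic implementation would have to pick a gensym-style fresh name for the temporary to get the same behaviour.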

There are more requirements, but really, that’s the most important one.

Hope this helps.

nrolland · October 9, 2022, 10:05am

This modular macros extension seems really powerful. Is there any plan to merge it?

yallop · October 10, 2022, 7:40am

Our aim is to develop it to the point where it can be merged, but whether it is actually merged will ultimately be a question for the OCaml development team. There’s still quite a bit of work to do before we reach that point.
