Am I missing some comprehensive Ppxlib resource somewhere?

As far as I’m aware (mostly through being in the community, attending talks, etc…), ppxlib is being pushed as the standard library for writing ppxes.
All seemed good and fine until I started looking for some comprehensive developer docs or similar kind of resources on the library…
Something seems to be missing. There’s just so little info online on it.
There’s this extremely brief manual (archived link), and a couple of tutorials online which do a good job introducing, not detailing, the library.
I’m having a lot of friction trying to compose Ast_pattern combinators via trial and error, without much to read in the manual or in the interface file (which is incomplete, as most of that module is generated)…

Where do ppx authors over here look to get an understanding of this library?

EDIT: next to the selected solution, the resources from the first reply are very useful!

I’m using a combination of “An introduction to OCaml PPX ecosystem” (Tarides), “PPX for plugin authors” (ppxlib documentation), and reading the AST in parsetree.mli.

Outside of that, the issue “Some info about Outreachy winter 2021/22” has good information, as does looking at the implementation of things like ppx_deriving.

The community could use more documentation about this topic. Compared to Template Haskell we are quite short on blog posts and other docs.

The development process of ppxlib is basically that the ppxlib authors/maintainers also maintain all ppxlib-using extensions with respect to compatibility / ppxlib updates. If you want to write (open source) ppx extensions, you may as well directly get in touch with them – for example by opening issues on ppxlib with your questions, or asking them here on Discuss – because they will look at your code sooner or later anyway.

I think somebody (maybe me) should do a write-up about first-class pattern matching on https://ocamlverse.github.io/. @hyphenrf, could you share what exactly troubles you with first-class patterns?

@lambda_foo Thanks! There are more links here than what I’ve come across.

@Kakadu such a document is sorely needed.

I think I somewhat get how first-class patterns are meant to be operated. What troubles me is the combination of obscure, undocumented names (like pstr), unclear relationships between abstract types (which is a synonym of which), and other implementation details obscured by abstraction. For example: which functions are the essential building blocks for custom behaviour? Which are meant to work together? How am I supposed to encode behaviour A or behaviour B? And so on.

As an example, I only learned of the fact that you can prohibit a payload by constructing a pattern like this:

Ast_pattern.(pstr nil)

… How did I learn that?? By doing a GitHub code search, haha. There’s no way I could’ve figured this out on my own except by pure chance, just following the types and hoping for something to click and give me the behaviour I want. pstr suggests structures, i.e. module-level stuff, so my first instinct was to ignore it because my ppx doesn’t touch the module language.

As for what I’ve been struggling with: I’ve been trying to figure out whether a fixed number of payloads is permitted, e.g. [%ext a b c], or “my extension can handle at most 3 payloads”, and, if so, how to make one of them optional so that omitting it changes behaviour, e.g. [%ext a b c] vs [%ext a b]. Promising candidates were alt_option, many, … and I even settled for encoding this behaviour as a pair vs a triple, that is, delimiting a b c with commas… but they all seem to operate in slightly different ways. For example, single_expr_payload can take an expression but not an expression pair. single_expr_payload is defined in terms of pstr and pstr_eval, so I attempted to construct a pair_expr_payload, but I don’t know how to split the conjunction I’m getting from the pair function… Stuff like that is a little hard to reverse-engineer from the quite abstract DSL I’m dealing with…

That’s somewhat unfortunate.
I do appreciate the amount of work put into the library, of course, and that it is offered to the community to improve the ergonomics of writing extension points. But with the current status quo (as you described it), it feels like the library is missing out on that potential and promise.

Thanks all for the discussion here so far!

@hyphenrf, you’ve already received links to the most important resources that could help you from @lambda_foo. I’ll just add some things about Ast_pattern since that’s what you seem most interested in.

There are two things you need to understand about Ast_pattern to get comfortable with it: its concept and the combinators it provides (by “combinators”, I mean the functions that let you construct new ast_patterns from other ast_patterns that you already have). To get a feeling for what combinators Ast_pattern provides, it’s good to separate the Ast_pattern functions into three kinds: the special helper functions, which are the ones from alt to pack3; the “entry point” combinators pstr, psig, ptyp and ppat; and the combinators that are generated from the Parsetree module in the same way as the Ast_builder functions are generated, which are pretty much all the others.
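To make the concept a bit more tangible, here is a toy model of first-class patterns using only the standard library. It is a simplified sketch and not ppxlib’s actual implementation: the names pat, capture, cons, parse and sum_pair are made up for illustration (capture plays the role of Ast_pattern’s __, and cons the role of ^::). The idea it shows is that a pattern is a value which inspects its input and feeds whatever it captures to a continuation.

```ocaml
(* Toy model of first-class patterns; NOT ppxlib's real implementation. *)
exception No_match

(* A pattern over values of type 'a: it takes a continuation of type 'b
   and, if the value matches, produces a result of type 'c. *)
type ('a, 'b, 'c) pat = 'a -> 'b -> 'c

(* Matches anything and passes the matched value to the continuation. *)
let capture : ('a, 'a -> 'b, 'b) pat = fun x k -> k x

(* Matches only the empty list and captures nothing. *)
let nil : ('a list, 'b, 'b) pat =
 fun l k -> match l with [] -> k | _ :: _ -> raise No_match

(* Matches a non-empty list: the head against [hd], the tail against [tl]. *)
let cons (hd : ('a, 'b, 'c) pat) (tl : ('a list, 'c, 'd) pat) :
    ('a list, 'b, 'd) pat =
 fun l k ->
  match l with x :: rest -> tl rest (hd x k) | [] -> raise No_match

(* Run a pattern against a value with a continuation. *)
let parse (p : ('a, 'b, 'c) pat) (x : 'a) (k : 'b) : 'c = p x k

(* Matches lists of exactly two ints and adds the two captured values. *)
let sum_pair (l : int list) : int option =
  try Some (parse (cons capture (cons capture nil)) l (fun a b -> a + b))
  with No_match -> None
```

Each capture adds one argument to the continuation, which is why the continuation passed to parse takes two arguments here.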

Some background about Ast_builder to understand the last kind: for the various kinds of record fields and value constructors in the compiler’s Parsetree module, there are functions in Ast_builder that let you build the corresponding parsetree (aka AST) node. What that correspondence looks like is explained at the beginning of the docs of the Ast_builder module. Once you know how to use Ast_builder to construct a node of a certain AST type T, it will be easy to use Ast_pattern to construct an ast_pattern that matches a node of type T: if there’s an Ast_builder.Default function foo to construct a node of AST type T, then there’s also an Ast_pattern function foo to construct the corresponding ast_pattern that matches such nodes of type T. That’s explained at the beginning of the docs of the Ast_pattern module with an example. Do you have questions about that example?
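As a small illustration of that correspondence, here is a sketch that builds an identifier expression with Ast_builder.Default.pexp_ident and matches it with Ast_pattern.pexp_ident. The names x_expr and ident_name are made up for this example, and catching every exception to signal “no match” is a shortcut for brevity, not a recommendation.

```ocaml
open Ppxlib

let loc = Location.none

(* Build the expression [x] with the Ast_builder function pexp_ident... *)
let x_expr = Ast_builder.Default.pexp_ident ~loc { txt = Lident "x"; loc }

(* ...and match identifiers with the Ast_pattern function of the same name.
   [__] captures the identifier's name and passes it to the continuation.
   A failed match raises, which this sketch turns into [None]. *)
let ident_name (e : expression) : string option =
  try
    Ast_pattern.parse
      Ast_pattern.(pexp_ident (lident __))
      loc e
      (fun name -> Some name)
  with _ -> None
```

Note that the pattern version does not mention locations: the builder takes a located longident, while the pattern works directly on the longident.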

That example also uses the entry point pstr. Unless your pattern is simple enough to use one of ppxlib’s helper functions directly (such as single_expr_payload), you will always need one of the four entry point functions pstr, psig, ptyp or ppat. Which of the four you should choose depends on what kind of payload you want the user to pass in to your extension node (or deriver): a structure, a signature, a core type, or a pattern (you have some info on that in one of the sections of the blog post @lambda_foo has pointed you to). The most common situation is having a structure as payload and hence using pstr. In that situation, you need to construct an ast_pattern that matches values of type structure and pass that ast_pattern to pstr. I’ve explained how to construct it via the generated functions in the last paragraph; apart from that, you can also use the special helper functions.
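To connect this to your [%ext a b] vs [%ext a b c] question, here is one possible sketch. It assumes that a payload such as a b c parses as a single expression applying a to the unlabelled arguments b and c, so the patterns go through pstr, pstr_eval and pexp_apply. The names two_exprs, three_exprs, payload_arity and payload_of are made up for illustration, and trying one pattern after the other is just one way to encode the optional third expression.

```ocaml
open Ppxlib

let loc = Location.none

(* Hypothetical pattern for [%ext a b]: a structure with exactly one
   expression item, which applies [a] to one unlabelled argument [b]. *)
let two_exprs () =
  Ast_pattern.(
    pstr (pstr_eval (pexp_apply __ (no_label __ ^:: nil)) nil ^:: nil))

(* Same idea for [%ext a b c]: one more unlabelled argument. *)
let three_exprs () =
  Ast_pattern.(
    pstr
      (pstr_eval (pexp_apply __ (no_label __ ^:: no_label __ ^:: nil)) nil
      ^:: nil))

(* Try the three-expression form first, then fall back to two.
   A failed match raises, which this sketch treats as "try the next one". *)
let payload_arity (payload : payload) : int option =
  try Some (Ast_pattern.parse (three_exprs ()) loc payload (fun _ _ _ -> 3))
  with _ -> (
    try Some (Ast_pattern.parse (two_exprs ()) loc payload (fun _ _ -> 2))
    with _ -> None)

(* Build the payload of [%ext a b ...] from variable names, for a quick check. *)
let payload_of (names : string list) : payload =
  match List.map (Ast_builder.Default.evar ~loc) names with
  | f :: args ->
      let e = Ast_builder.Default.eapply ~loc f args in
      PStr [ Ast_builder.Default.pstr_eval ~loc e [] ]
  | [] -> PStr []
```

In a real extension, the continuations would receive the captured expressions instead of ignoring them, one argument per __ in the pattern.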

To pick up your example about Ast_pattern.(pstr nil): One of the sections in the mentioned blog post explains Ast_pattern a bit. At the end of that section, it’s pointed out that Ast_pattern.(pstr nil) is the pattern you need to “prohibit a payload”. However, I think what you want to understand is why that’s the pattern you need. As explained, you need to construct an ast_pattern that matches values of type structure and pass that ast_pattern to pstr. Now, why is the ast_pattern you want to pass to pstr given by nil? That’s because structure is defined as a list of structure items. Since you don’t want any payload at all, you want a pattern matching the empty list. That’s what nil does.
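Here is a minimal sketch of that reasoning in code. The helper is_empty_payload and the extension name "hello" are made up for illustration; the sketch only relies on a structure payload being a list of structure items, so that nil matches exactly the empty payload.

```ocaml
open Ppxlib

let loc = Location.none

(* True when [payload] is an empty structure, i.e. matches [pstr nil].
   A failed match raises, which this sketch turns into [false]. *)
let is_empty_payload (payload : payload) : bool =
  try Ast_pattern.parse Ast_pattern.(pstr nil) loc payload true
  with _ -> false

(* Hypothetical extension [%hello]: it accepts no payload and expands to a
   string literal. Nothing is captured by the pattern, so the expander
   takes no arguments besides [~loc] and [~path]. *)
let hello_ext =
  Extension.declare "hello" Extension.Context.expression
    Ast_pattern.(pstr nil)
    (fun ~loc ~path:_ -> Ast_builder.Default.estring ~loc "hello")
```

With this pattern, writing [%hello 1] would be rejected, because the payload structure contains one item and nil only matches the empty list.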

All that being said, our documentation is really not complete enough, and “Something seems to be missing. There’s just so little info online on it.” is definitely not the kind of user experience we want folks to have when using ppxlib. We have two things on our near-future to-do list that will hopefully improve that situation. One is making the different docs (info on blog posts, user manual, API docs) easier to find and available at the same place. And the other one is adding a section to the user manual on how to construct (Ast_builder and metaquot) and destruct (Ast_pattern and metaquot) AST nodes. @Kakadu, if you want to give that a go (or, if you prefer, only the part about Ast_pattern), that’s awesome!

Edit: Apart from getting a general understanding of how Ast_pattern works and understanding the example of Ast_pattern.(pstr nil), what you seem to be struggling with most are the special Ast_pattern helper functions (which, in my classification above, are the “first kind” of Ast_pattern functions). What I’ve tried to do with my answer is to help you (and others) understand how Ast_pattern works in general, which should help you get a better grip on all of Ast_pattern, including the helper functions. I haven’t explained the helper functions the way I did with the other two kinds of functions (i.e. in a general way), since that doesn’t really work given that each of them has its own individual behavior. So what needs to be done for them is adding documentation to the API of each of those helper functions. I’ll take note of that for our to-do list :)

About your suggestion, @gasche, that folks should ask questions on issues and on Discuss whenever they lack info on ppxlib, “because [we] (the ppxlib maintainers) will look at [their] code sooner or later anyway.”: I definitely agree with the suggestion of asking! That helps both folks to get answers to their questions and us to get a better overview of what kind of info is missing the most. About the reasoning that we’ll look at their code sooner or later anyway: I wouldn’t be that sure about that, tbh. Most ppx rewriters use parts of the ppxlib API that will most likely stay stable, so for lots of rewriters, we’ll never look at their code individually.

First of all thank you for the in-depth reply <3

I did notice that at the beginning of the docs for Ast_pattern; it’s good to see reassurance that the two modules perfectly reflect each other… I remember taking a look at Ast_builder’s docs (the mli) and feeling like they wouldn’t really help me much with Ast_pattern. Perhaps some knowledge of how the OCaml parsetree itself is constructed is needed a priori?

Perhaps they should be better emphasized in the docs then… Understanding the structure of ast patterns as you described it, with the three kinds of functions, makes it much easier to see through the abstraction and how it operates. This kind of valuable info doesn’t come across easily from the example in the mli until someone (you in this case) takes it apart and explains how each part functions.

Yes, some knowledge about that is needed. To get a general impression about the OCaml parsetree, it’s helpful to have a look at the compiler’s parsing/parsetree.mli file, which @lambda_foo has already given a link to. In there, you can see how the parsetree/AST is structured with comments pointing out the source code corresponding to each node.

To play around a bit and to find out what the node representing a concrete piece of source code looks like, I recommend using ast explorer, which is a tool to check out the AST of your source code across different languages, including OCaml.

“perhaps they should be better emphasized in the docs then…”

It’s not easy to add docs for them in their API, since they’re also generated. Given that my answer seems to have been helpful, I’ll try to wrap it up a little (hopefully sometime soon) to add it to the manual as a first step of adding a section about Ast_pattern. @Kakadu, let me know if you’re planning to write up something about Ast_pattern (which, again, would be awesome) to see if we can join efforts.
