This is the first release of ppx_regexp, a ppx extension which turns OCaml patterns into invocations of the Re library. It supports named bindings of type string or string option, depending on whether the capture is guaranteed to match. The ppx also lifts the compilation of the regular expressions out of abstractions and functors, up to the file-level module.
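As a rough illustration of the binding rule, here is a minimal sketch. It assumes the function%pcre form and the (?<name>...) group syntax described in the project README; the line type and the classify function are hypothetical names made up for this example.

```ocaml
(* Preprocess with ppx_regexp and link with re.
   The [line] type and [classify] are illustrative only. *)
type line =
  | Pair of string * string
  | Comment of string option
  | Other

let classify = function%pcre
  (* k and v always take part in a match, so both are bound as string. *)
  | {|^(?<k>[a-z]+)=(?<v>.*)$|} -> Pair (k, v)
  (* text sits in an optional group, so it is bound as string option. *)
  | {|^#(?<text>.+)?$|} -> Comment text
  (* Match-all case; without it the rewriter issues a warning. *)
  | _ -> Other
```

Because classify is defined at the top level, the regular expression is compiled once rather than on every call.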
This extension should be useful as a data-mining tool, due to its terse syntax and the efficiency of the Re library on which it is based. I have previously used mikmatch with great success, and I wanted a ppx-based replacement. Fortunately, the needed ingredients are now available in the Re library (with Re.Mark) and the ppx framework.
Note that there are also some limitations:
- Pattern guards are not supported. Since the full function is compiled into a single regular expression, once the Re.exec function finds a match and terminates, we can no longer backtrack if the guard condition fails.
- No exhaustiveness checks are done. Instead, the rewriter will issue a warning if no match-all case is provided. Exhaustiveness checking for regular expressions seems like an interesting problem, but not one that I will look into myself at this point.
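To make the guard limitation more concrete, here is a hand-written sketch of the general technique: all cases are combined into one alternation, each tagged with Re.Mark, and the result is compiled once at module level. This is not the actual output of ppx_regexp; the token type and the classify function are hypothetical, and only the Re calls are taken from the library.

```ocaml
(* Requires the re library. A sketch of mark-based dispatch,
   not the code that ppx_regexp really generates. *)
type token = Num of int | Word of string | Unknown

(* Each case gets its own mark, so we can tell which one matched. *)
let mark_num, re_num =
  Re.mark (Re.seq [Re.bos; Re.group (Re.rep1 Re.digit); Re.eos])
let mark_word, re_word =
  Re.mark (Re.seq [Re.bos; Re.group (Re.rep1 Re.alpha); Re.eos])

(* One combined expression, compiled once at module level. *)
let compiled = Re.compile (Re.alt [re_num; re_word])

let classify s =
  match Re.exec_opt compiled s with
  | None -> Unknown
  (* Groups are numbered across the whole alternation: 1, then 2. *)
  | Some g when Re.Mark.test g mark_num ->
    Num (int_of_string (Re.Group.get g 1))
  | Some g when Re.Mark.test g mark_word ->
    Word (Re.Group.get g 2)
  | Some _ -> Unknown
```

By the time Re.exec_opt returns, the matcher has already settled on one alternative, so a failed guard on that case has no "next case" to fall through to. A condition that would have been a guard can instead be tested inside the branch body.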
There is also a combinator-based interface to Re, tyre. The transformation done by ppx_regexp is very similar to what tyre does, so I’d expect about the same performance.