Handling Strings That Don't Match a Specific Pattern

Hey everyone,

I’m currently working on a parser for reStructuredText as a preliminary step toward building a Sphinx-compatible tool using OCaml. One tricky part I’m facing is the handling of inline markup, particularly for bold text, like **bold text**.

The challenge lies in capturing text that doesn’t prematurely match the closing markup pattern—specifically, ** followed by a space, newline, or end-of-file. While it’s technically possible to model this using an equivalent regular expression, replicating that logic across multiple inline styles (like italics, literals, etc.) would make the codebase hard to maintain and error-prone.

Has anyone dealt with something similar in ocamllex or have tips on a cleaner approach for avoiding false-positive matches in these kinds of markup scenarios?

Thanks in advance!

It’s not me. Is that an LLM?

If not, you could use re instead of lex, as it allows shortest matching instead of longest, and lets you use ordinary functions to define regexes. There is a runtime cost though, of course.
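
To make the re suggestion a bit more concrete, here is a rough sketch, not a complete reST inline parser. It assumes the re library from opam is installed; the names inline, bold, literal and first_match are made up for illustration. The idea is that one function builds the regex for any delimiter, so the "closing delimiter followed by whitespace or end of input" rule is written only once.

```ocaml
(* Assumes the `re` library is available (opam install re). *)

(* Hypothetical helper: regex for inline markup wrapped in [delim].
   The body is matched non-greedily, and the closing delimiter only
   counts when it is followed by whitespace or the end of the input. *)
let inline delim =
  let open Re in
  let closing = seq [ str delim; alt [ space; eos ] ] in
  seq [ str delim; group (non_greedy (rep1 any)); closing ]

(* The same rule reused for two inline styles. *)
let bold = Re.compile (inline "**")
let literal = Re.compile (inline "``")

(* Hypothetical helper: body of the first match, if there is one. *)
let first_match re s =
  Option.map (fun g -> Re.Group.get g 1) (Re.exec_opt re s)

let () =
  match first_match bold "some **bold text** here" with
  | Some body -> print_endline body
  | None -> print_endline "no match"
```

This skips the rest of the reST rules (for example, what may come before the opening delimiter, or closing punctuation after the end marker), so treat it as a starting point only.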