Yes, sedlex’s regular expressions are quite limited and don’t support submatching or any extension. ocamllex has submatching. Re also adds several additional operators, but no generalized complementation/difference. Dreml (the paper listed above) has both submatching and complementation but doesn’t determinize.
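For anyone wondering what "generalized complementation/difference" means in practice, here is a minimal sketch using Brzozowski derivatives, one well-known technique in which complement comes almost for free. It is not taken from sedlex, ocamllex, Re or Dreml; all the names (`re`, `nullable`, `deriv`, `matches`, `diff`) are made up for illustration, and it works on plain `char`s with no submatching and no term normalization.

```ocaml
(* Illustrative sketch only: a tiny derivative-based matcher with complement.
   None of these names come from sedlex, ocamllex, Re or Dreml. *)
type re =
  | Empty            (* matches nothing *)
  | Eps              (* matches only the empty string *)
  | Chr of char      (* matches exactly one given character *)
  | Cat of re * re   (* concatenation *)
  | Alt of re * re   (* alternation *)
  | Star of re       (* Kleene star *)
  | Not of re        (* complement: every string the argument rejects *)

(* Does the expression accept the empty string? *)
let rec nullable = function
  | Empty | Chr _ -> false
  | Eps | Star _ -> true
  | Cat (a, b) -> nullable a && nullable b
  | Alt (a, b) -> nullable a || nullable b
  | Not a -> not (nullable a)

(* Derivative with respect to [c]: what is left to match after reading [c]. *)
let rec deriv c = function
  | Empty | Eps -> Empty
  | Chr d -> if c = d then Eps else Empty
  | Cat (a, b) ->
    let left = Cat (deriv c a, b) in
    if nullable a then Alt (left, deriv c b) else left
  | Alt (a, b) -> Alt (deriv c a, deriv c b)
  | Star a as r -> Cat (deriv c a, r)
  | Not a -> Not (deriv c a)

(* Feed the string one character at a time, then test for acceptance. *)
let matches r s =
  let state = ref r in
  String.iter (fun c -> state := deriv c !state) s;
  nullable !state

(* Difference a \ b, encoded as not (not a | b). *)
let diff a b = Not (Alt (Not a, b))

(* Any string over {a, b}, except exactly "ab". *)
let any = Star (Alt (Chr 'a', Chr 'b'))
let r = diff any (Cat (Chr 'a', Chr 'b'))

let () =
  assert (not (matches r "ab"));
  assert (matches r "aab");
  assert (matches r "")
```

The terms grow with every character here because nothing is simplified. Turning this into an automaton would require normalizing derivatives so that only finitely many distinct states appear, and working over character classes rather than single chars.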
The state of the art in regular expression engines is surprisingly incomplete. Nobody has really figured out a proper way to combine (longest) submatching, complementation and online determinization without blowing up on Unicode character classes. To interested people in the room … 