Um, that regexp will match
/** foo bar */
....
...
/** goo boo */
...
by going from “foo” all the way thru to “boo”, no? You want a slightly more-complex regexp in the middle (instead of _*).
Hey Chet, yeah, those are all valid docblocks too, but they should return an empty list. Other valid examples:
/**
* Mo mo mo, some info
* @return void
*/
or
/** bla
* @param array<string, int> $ar
more bla bla */
No, I mean that you’ll end up grabbing both docblocks, and all the code in between, won’t you? Lex will look for the longest match, right?
Ah crap, you’re right. Thanks, will fix.
Instead of _*, maybe you want something like:
( [^ '*'] | '*'+ [^ '/' '*'] )*
[I’m doing this on-the-fly, so I could be making a mistake here]
The idea is, you want the complement of the language of “*/”.
At least, I think that’s how it works – been so long I don’t quite remember anymore.
I’ll check some docs if I can match the shortest possible string instead.
Oh wait, and then at the end, you have '*'* right before “*/”.
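Putting those two pieces together, the whole pattern would presumably be "/**" ( [^ '*'] | '*'+ [^ '/' '*'] )* '*'+ '/' (since '*'* followed by "*/" is the same as '*'+ '/'). To sanity-check that it stops at the first terminator, here is a rough hand-written equivalent in plain OCaml. The function name match_docblock is made up for illustration, and this is only a sketch of the automaton, not something ocamllex generates.

```ocaml
(* Returns the length of the match starting at [pos] for
   "/**" ( [^'*'] | '*'+ [^'/' '*'] )* '*'+ '/'
   or None if the input does not match there. *)
let match_docblock s pos =
  let n = String.length s in
  (* [in_stars] is true when the character just before [i] (after the
     opening "/**") was a '*', i.e. we are inside a run of stars. *)
  let rec body i in_stars =
    if i >= n then None
    else
      match s.[i], in_stars with
      | '/', true -> Some (i + 1 - pos)   (* '*'+ '/' closes the comment *)
      | '*', _ -> body (i + 1) true       (* extend the run of stars *)
      | _, _ -> body (i + 1) false        (* any other char resets the run *)
  in
  if pos + 3 <= n && String.sub s pos 3 = "/**" then body (pos + 3) false
  else None
```

On the input /** abc */ x y z /** def */ this matches 10 characters at position 0 and another 10 at position 17, so each docblock comes out separately.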
It’s only the character combination */ that stops the comment. I made a buffer now instead, with a separate rule. Didn’t find anything about non-greedy matching in ocamllex.
Yes, but the string “*/” also matches the regexp “_*”. So the input
/** abc */ x y z /** def */
will yield a single docblock, containing the entire line. Or at least, IIRC, that’s how lex will work.
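To make the difference concrete, here is a small plain-OCaml sketch (no ocamllex involved) that models the two behaviours: a longest-match "/**" _* "*/" ends at the last “*/” in the input, while the intended behaviour ends at the first one. All function names here are invented for the illustration.

```ocaml
(* Index of the first occurrence of [sub] in [s] at or after [from]. *)
let find_from s sub from =
  let n = String.length s and m = String.length sub in
  let rec go i =
    if i + m > n then None
    else if String.sub s i m = sub then Some i
    else go (i + 1)
  in
  go from

(* Index of the last occurrence of [sub] in [s] at or after [from]. *)
let find_last_from s sub from =
  let n = String.length s and m = String.length sub in
  let rec go i =
    if i < from then None
    else if String.sub s i m = sub then Some i
    else go (i - 1)
  in
  go (n - m)

(* Extract a docblock using [find_close] to pick the closing "*/". *)
let docblock_with find_close s =
  match find_from s "/**" 0 with
  | None -> None
  | Some start ->
    (match find_close s "*/" (start + 3) with
     | None -> None
     | Some stop -> Some (String.sub s start (stop + 2 - start)))

(* Models the longest match of "/**" _* "*/". *)
let greedy_docblock = docblock_with find_last_from

(* Models stopping at the first "*/". *)
let first_terminator_docblock = docblock_with find_from

let input = "/** abc */ x y z /** def */"
```

With this input, greedy_docblock returns the entire line, while first_terminator_docblock returns just /** abc */.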
For closure, this is what I ended up with:
and docblock_comment buffer = parse
| "*/" { DOCBLOCK_AS_STR (Buffer.contents buffer) }
| '\n' { new_line lexbuf; docblock_comment buffer lexbuf }
| whitespace_char_no_newline+ { docblock_comment buffer lexbuf }
| _? as s { Buffer.add_string buffer s; docblock_comment buffer lexbuf }
| eof { failwith "unterminated docblock" }
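For anyone who wants to see what that rule does without running ocamllex, here is a rough plain-OCaml model of it. It assumes whitespace_char_no_newline covers space, tab and carriage return (the definition isn't shown above), it leaves out the new_line position bookkeeping, and the name scan_docblock is invented for the sketch. It starts scanning just past the opening “/**”.

```ocaml
(* Model of the buffer-based rule: scan [s] from [pos] and return the
   collected text together with the position right after the closing "*/". *)
let scan_docblock s pos =
  let n = String.length s in
  let buf = Buffer.create 64 in
  let rec go i =
    if i >= n then failwith "unterminated docblock"
    else if i + 1 < n && s.[i] = '*' && s.[i + 1] = '/' then
      (* "*/" is two characters, so it beats the one-character catch-all *)
      (Buffer.contents buf, i + 2)
    else
      match s.[i] with
      | ' ' | '\t' | '\r' | '\n' -> go (i + 1)  (* whitespace is skipped, not buffered *)
      | c -> Buffer.add_char buf c; go (i + 1)
  in
  go pos
```

One thing this makes visible: since whitespace is skipped rather than buffered, the returned string has no spaces or newlines in it, so /** foo bar */ comes back as "foobar".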
Ah, that should work, b/c (IIRC) longest-match wins, and at the terminator that’s “*/”.
It is simpler to write what you wrote than to calculate out the regexp, even if (to me) the regexp is … more satisfying (grin).
I didn’t find any way to make a non-greedy regexp, so no choice.