I wanted to convert a string input into a string list, splitting wherever a certain pattern matches. I found out about the Str module and its regexp functionality, which seemed to be the thing I was looking for.
But I can’t quite understand how to use it.
I have a list of keywords to match: let lst = ["pig"; "cow"; "cat"]
I figured I could convert it into a regexp via: let reg = Str.regexp (String.concat "//|" lst)
using "//|" as the “alternative” operator, and then use let lst2 = Str.full_split reg "pigcowcat"
But it has no effect.
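For reference, a minimal sketch of the same idea with the alternation written the way Str expects it. In Str syntax the alternative operator is \| (backslash, pipe), which is "\\|" inside an OCaml string literal; "//|" is just three literal characters, so the pattern never matches. The names keywords, reg and parts are illustrative, and the Str.quote call is an extra precaution that is not in the original attempt.

```ocaml
(* Needs the str library that ships with OCaml to be linked in. *)
let keywords = ["pig"; "cow"; "cat"]

(* Str.quote escapes regexp metacharacters in each keyword.
   "\\|" is the OCaml literal for the two characters \| ,
   which is Str's alternation operator. *)
let reg = Str.regexp (String.concat "\\|" (List.map Str.quote keywords))

(* full_split keeps the delimiters (the matched keywords)
   as well as any text between them. *)
let parts = Str.full_split reg "pigcowcat"
(* parts = [Str.Delim "pig"; Str.Delim "cow"; Str.Delim "cat"] *)
```

With an input such as "a pig and a cow", the same regexp should give [Text "a "; Delim "pig"; Text " and a "; Delim "cow"].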
Nice, it worked, thanks.
But now I need a regular expression for positive floats/ints, and I’ve come up with this: \\([0-9]*.?[0-9]*\\)
But it matches everything except the first character for some reason.
In principle it should match 90, 0.9, .9, .90 and 90., but it also matches a lone .
How can I prevent the last case?
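One possible way to rule out the lone dot, sketched under the assumption that the whole string must be a number: require at least one digit on one side of the dot. In the attempt above, the unescaped . matches any character and every part is optional, so the pattern also matches the empty string. The names number_re and is_positive_number are made up for this example.

```ocaml
(* Needs the str library to be linked in. *)

(* Either digits with an optional ".digits*" tail  (90, 0.9, 90.)
   or a dot followed by at least one digit         (.9, .90).
   "\\." is a literal dot; a bare "." would match any character. *)
let number_re = Str.regexp "[0-9]+\\(\\.[0-9]*\\)?\\|\\.[0-9]+"

(* string_match only anchors at the start, so also check that
   the match runs to the end of the string. *)
let is_positive_number s =
  Str.string_match number_re s 0 && Str.match_end () = String.length s

let accepted = List.map is_positive_number ["90"; "0.9"; ".9"; ".90"; "90."]
let rejected = List.map is_positive_number ["."; ""; "9.a"]
```

Here every element of accepted should be true and every element of rejected should be false.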
The last few times I needed a regexp for float literals, I went and copied the one out of the OCaml compiler source (parsing/lexer.mll, search for “float_literal”). It’s not the same syntax as any particular regex engine (at least, not that I know of) but should be straightforward enough to transliterate.
Oh come now. Regexps are an amazing tool, and I routinely use them to solve problems that would otherwise require significant amounts of code. By shrinking the code down to a single line, it’s actually more comprehensible and easier to check. Sure, then you have to really stare at it, but I’m quite convinced that it’s better than a pile of parsing logic.
I wrote a regexp once that exactly parsed an XML start-tag. And used that in a regexp that parsed the next syntactic element (start/end-tag, xmldecl, chars, char-escape) in XML. Was much, much better than something lower-level.
I mean, here’s one: a regexp (for sedlex) for JSON floating-point numbers (read right off the spec, IIRC):
let digit = [%sedlex.regexp? '0'..'9']
let int = [%sedlex.regexp? '0' | ( ('1'..'9') , (Star digit) )]
let frac = [%sedlex.regexp? '.' , (Star digit)]
let ne_frac = [%sedlex.regexp? '.' , (Plus digit)]
let exp = [%sedlex.regexp? ('e' | 'E') , (Opt ('-' | '+')) , (Plus digit)]
let json_number = [%sedlex.regexp? (Opt '-') , int, Opt ne_frac, Opt exp]
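For readers without sedlex at hand, here is a rough transliteration of json_number into Str syntax. This is only a sketch: it swaps sedlex for Str, the names json_number_re and is_json_number are invented for the example, and it checks whole strings rather than lexing a stream.

```ocaml
(* Needs the str library to be linked in. *)

(* Opt '-', int, Opt ne_frac, Opt exp, written in Str syntax:
     int     = 0 | [1-9][0-9]*
     ne_frac = \.[0-9]+
     exp     = [eE][-+]?[0-9]+                                  *)
let json_number_re =
  Str.regexp
    "-?\\(0\\|[1-9][0-9]*\\)\\(\\.[0-9]+\\)?\\([eE][-+]?[0-9]+\\)?"

(* Accept only if the match covers the whole string. *)
let is_json_number s =
  Str.string_match json_number_re s 0 && Str.match_end () = String.length s

let ok  = List.map is_json_number ["0"; "10"; "0.5"; "1e10"; "-9.37e-5"]
let bad = List.map is_json_number ["0123"; "1."; ".5"; "-"]
```

Under this grammar -9.37e-5 is read as a single number with an exponent part, and the strings in bad are rejected.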
There never was any confusion; I’m just asking how an algorithm can determine whether -9.37e-5 means -9.37*10^-5 or -9.37*e - 5. It can’t, because both make sense, so I chose the second option, since you can’t express “e” in a different manner.