Proper style when discriminating over a value with boolean functions?

The following is a slightly contrived example, but it explains the style question I’m trying to understand.

Say that you’re trying to write a little lexical analyzer by hand. You want to discriminate between a number of classes that a character c could be a member of, and rather than using match constructs like 'A'..'Z' | 'a'..'z' you would like to call some boolean function like is_letter c instead.

(Note: I said this was slightly contrived, but it reflects a real problem in a slightly different domain.)

Say there are a dozen such classes you’re trying to discriminate between.

Now, you could do a match c with followed by a lot of lines with guard conditions expressed with when like

match c with
| c when is_letter c -> something
| c when is_digit c -> something_else
| c when is_ctrl c -> yet_another_thing

etc. etc.

Or you could do

if is_letter c then something else
if is_digit c  then another_something else
if is_ctrl c   then yet_another_thing

etc. etc.
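For anyone who wants to compile the two forms side by side, here is a minimal sketch. The predicate definitions and the string results are assumptions made up for illustration; the question above leaves is_letter, is_digit and is_ctrl undefined.

```ocaml
(* Hypothetical predicates, ASCII only; not defined anywhere in the thread. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'
let is_ctrl c = Char.code c < 32 || Char.code c = 127

(* Guard-only match: each arm rebinds c and tests it with [when]. *)
let classify_match c =
  match c with
  | c when is_letter c -> "letter"
  | c when is_digit c -> "digit"
  | c when is_ctrl c -> "ctrl"
  | _ -> "other"

(* The same decision written as an if/else chain. *)
let classify_if c =
  if is_letter c then "letter" else
  if is_digit c then "digit" else
  if is_ctrl c then "ctrl" else
  "other"
```

Both functions test the predicates in the same order, so they agree on every input; the difference is purely how they read.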

Both of these choices feel kind of unsatisfying to me. The match is really not a match at all, it’s just an endless series of guard expressions that don’t really use the pattern variable for anything, and it looks ugly to me.

The if also feels kind of gross, perhaps because it looks too much like the worst of what you would do in other (lesser) languages.

In Lisp, I’d just use a cond, which is purpose-built for discriminating against a large set of cases defined by boolean expressions, but it doesn’t feel like there’s a good choice here in OCaml.

Which of these, if either, is better style? If neither, what is the correct way to do such a thing with good style?

In general I’d prefer the latter (though with else at the end and if aligned in front, per the style guide on ocaml.org), for the same reasons that you dislike the match. The if/else drudgery is, honestly, what it is. But if something and something_else are long enough expressions, I might prefer the match simply because longer expressions already work with it, without parens or begin/end.

There’s another option though:

match charclass_of c with
| Letter -> something
| Digit -> something_else
| Ctrl -> yet_another_thing
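
A rough sketch of what charclass_of could look like, assuming a variant type with an extra Other case and the same hypothetical ASCII predicates as in the question (none of these definitions come from the thread):

```ocaml
(* Hypothetical predicates, ASCII only. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'
let is_ctrl c = Char.code c < 32 || Char.code c = 127

(* Assumed variant; Other is added here so that every character has a class. *)
type charclass = Letter | Digit | Ctrl | Other

(* The boolean tests live in one place, as a single if/else chain. *)
let charclass_of c =
  if is_letter c then Letter
  else if is_digit c then Digit
  else if is_ctrl c then Ctrl
  else Other

(* Call sites then use an ordinary match on constructors. *)
let describe c =
  match charclass_of c with
  | Letter -> "letter"
  | Digit -> "digit"
  | Ctrl -> "ctrl"
  | Other -> "other"
```

The chain of tests does not go away, but it is written once, and the compiler can check every match on charclass for exhaustiveness.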

Okay, so I’m not imagining it, and I’m guessing there’s no more elegant solution.

I think I’ll do the “if” style when I hit stuff like this (and yes, I should have formatted it per the style guide, I’ve partially edited to fix that.)

That said, this is one of the few places where I find the lack of the equivalent of a Lisp cond (i.e., multi-way if) irritating. It would be neat if I could just do

cond
| is_letter c -> something
| is_digit c  -> something
| is_ctrl c   -> something
| true        -> default_thing

etc.

Another alternative: it would also be cool if match had some form where it permitted clauses that were “all guard” with some sort of syntactic sugaring.

However, none of that exists, so I’ll live with the if.
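
Short of new syntax, something cond-like can probably be approximated with an ordinary function. The sketch below is only one possible shape: cond_on is a made-up name, the predicates are the same hypothetical ASCII ones, and each clause is a (test, body) pair of functions so that nothing runs until a test succeeds.

```ocaml
(* Hypothetical predicates, ASCII only. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'
let is_ctrl c = Char.code c < 32 || Char.code c = 127

(* cond_on (made-up helper): apply the body of the first clause whose test
   accepts x, or the default if no test does. Later tests are not run once
   one has matched. *)
let cond_on x clauses ~default =
  match List.find_opt (fun (test, _) -> test x) clauses with
  | Some (_, body) -> body x
  | None -> default x

let classify c =
  cond_on c
    [ (is_letter, fun _ -> "letter");
      (is_digit, fun _ -> "digit");
      (is_ctrl, fun _ -> "ctrl") ]
    ~default:(fun _ -> "other")
```

This builds a list and a few closures on every call, so for a hot loop in a lexer the plain if chain is likely still the better choice.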

I believe the difference in readability in both cases here is small: both emphasise a left-to-right case-by-case structure, which is good.

Further advantages and disadvantages when using match:

  • It is easy to extend with additional matches against literal characters.
  • If you want to match against named constants, you need to use when, but you could use them directly when using if.
  • if requires begin/end for statement blocks, whereas match does not.

match c with
| c when c = Const.escape -> ...
...

if c = Const.escape then ...
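
To make these points concrete, here is a small sketch. The Const module, the log buffer and the strings written to it are all invented for the example; only the shape of the code matters.

```ocaml
(* Hypothetical stand-in for the Const module used above. *)
module Const = struct
  let escape = '\027' (* ASCII ESC, decimal 27 *)
end

(* Hypothetical sink so the side effects can be observed. *)
let log = Buffer.create 16

(* match: a literal pattern sits next to a guarded named constant, and an
   arm can hold a sequence of statements without begin/end. A pattern
   cannot refer to a let-bound value such as Const.escape, hence the guard. *)
let handle_match c =
  match c with
  | '\n' -> Buffer.add_string log "nl;"
  | c when c = Const.escape ->
    Buffer.add_string log "esc;";
    Buffer.add_string log "reset;"
  | _ -> Buffer.add_string log "char;"

(* if: the constant is compared directly, but a multi-statement branch
   needs begin/end (or parentheses). *)
let handle_if c =
  if c = '\n' then Buffer.add_string log "nl;"
  else if c = Const.escape then begin
    Buffer.add_string log "esc;";
    Buffer.add_string log "reset;"
  end
  else Buffer.add_string log "char;"
```

Without the begin/end, `if c then a; b` parses as `(if c then a); b`, so b would run unconditionally.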

All in all, this is mostly a matter of taste.

Incidentally Erlang has exactly what you’re asking for - a construct that’s like pattern-matching except you have a series of guards instead of a series of patterns. Erlang calls it if though, so technically ‘if’ is the alternative in both languages :slight_smile:

Anyway it’s an example of prior art if you want to argue that OCaml should have such a thing.

I think I’m too new in OCamlLand to argue for such things even if my current impulse is to want one. :slight_smile:

I slightly edited your example. Not sure it is an improvement, but it should make it obvious that a match is used for its side effects, so to speak.
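
The edited snippet itself does not appear in this copy of the thread. Purely as a guess at the kind of thing that fits the description, and not the poster's actual code, one known idiom is to match on unit so that every arm is nothing but a guard. The predicates are again hypothetical ASCII-only definitions.

```ocaml
(* Hypothetical predicates, ASCII only. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'
let is_ctrl c = Char.code c < 32 || Char.code c = 127

(* Matching on () makes it plain that the scrutinee carries no information:
   the arms are selected by their [when] guards alone, tried top to bottom. *)
let classify c =
  match () with
  | () when is_letter c -> "letter"
  | () when is_digit c -> "digit"
  | () when is_ctrl c -> "ctrl"
  | () -> "other"
```

This reads much like the cond wished for earlier, with the final unguarded arm playing the role of the default clause.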

That’s an interesting way to do things. It’s weird but looks somehow better than the alternatives I’ve thought of. Thank you!