Conversion patterns of pattern matching to disjoint patterns

Hi all!

I need to develop algorithm to convert common patterns (without guards) to equivalent disjoint patterns.

For example, for type

type tree = Leaf | Node of tree * tree

I would like to convert program

match x with
| Node (_, Leaf) -> 1
| Node (Leaf, _) -> 2
| _              -> 3

to

match x with
| Node (_, Leaf)                 -> 1
| Node (Leaf, Node (_, _))       -> 2
| Node (Node (_, _), Node(_, _)) -> 3
| Leaf ->                           3

I guess this problem already has solved, but I can’t find any solutions.

I would be grateful for any links to papers or code, keywords, anything that you think will help to find these solutions.

This can easily be done using “matrix decomposition” approaches to pattern-matching analysis. See for example Compiling Pattern Matching to good Decision Trees, Luc Maranget, 2008. Pattern matrices and matrix decomposition are explained on pages 2 and 3.

In your example, the decomposition may go as follows:

| Node (_, Leaf) -> 1
| Node (Leaf, _) -> 2
| _              -> 3

= (split first column in Node|Leaf)

Node(□)  | (_, Leaf) -> 1
Node(□)  | (Leaf, _) -> 2
Node(□)  | (_, _) -> 3
Leaf     | -> 3

= (split first tuple column)

Node(□, □) | _   Leaf -> 1
Node(□, □) | Leaf _ -> 2
Node(□, □) | _    _ -> 3
Leaf       | -> 3

= (split first colum in Node|Leaf)

Node(Node(□, □), □) | _ _ Leaf -> 1
Node(Node(□, □), □) | _ _ _ -> 3
Node(Leaf, □) | Leaf -> 1
Node(Leaf, □) | _ -> 2
Node(Leaf, □) | _ -> 3
Leaf | -> 3

= (push two colums)

Node(Node(_, _), □) | Leaf ->1
Node(Node(_, _), □) | _ -> 3
Node(Leaf, □) | Leaf -> 1
Node(Leaf, □) | _ -> 2
Node(Leaf, □) | _ -> 3
Leaf | -> 3

= (split first column in Node|Leaf)

Node(Node(_, _), Node(□,□)) | _ _ -> 3
Node(Node(_, _), Leaf) | -> 1
Node(Node(_, _), Leaf) | -> 3
Node(Leaf, Node(□,□)) | _ _ -> 2
Node(Leaf, Node(□,□)) | _ _ -> 3
Node(Leaf, Leaf) | -> 1
Node(Leaf, Leaf) | -> 2
Node(Leaf, Leaf) | -> 3
Leaf | -> 3

= (push two columns)

Node(Node(_, _), Node(_,_)) | -> 3
Node(Node(_, _), Leaf) | -> 1
Node(Node(_, _), Leaf) | -> 3
Node(Leaf, Node(_,_)) | -> 2
Node(Leaf, Node(_,_)) | -> 3
Node(Leaf, Leaf) | -> 1
Node(Leaf, Leaf) | -> 2
Node(Leaf, Leaf) | -> 3
Leaf | -> 3

= (remove unreachable rows, final result)

Node(Node(_, _), Node(_,_)) -> 3
Node(Node(_, _), Leaf) -> 1
Node(Leaf, Node(_,_)) -> 2
Node(Leaf, Leaf) -> 1
Leaf -> 3
2 Likes

Hi not_kill !

It is more than a guess, ocaml compiler does it:

type tree = Leaf | Node of tree * tree
let f = function
| Node (_, Leaf) -> 1
| Node (Leaf, _) -> 2;;
Characters 8-60:
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
(Node (Node (_, _), Node (_, _))|Leaf) (* copy paste here when lazy *)
val f : tree -> int = <fun>

I think gasche suggest a really good resource to do this. If ever you want more, you can type keywords like “compiling pattern matching” or “pattern matching exhaustive check” on your search engine.
Have a good day !

Thanks, this is similar to what I need!
But also I would like to minimize count of patterns (in my example after conversion we have 4 patterns, but after sequence of decompositions we have 5 patterns).

I going to converse ocaml pattern matching to miniKanren relational program. And I need as few disjoint patterns as possible.

Do you know how to convert patterns while minimizing disjoint patterns number?
As above, any ideas, links, keywords will be useful to me!

Hi ruyblast!

Thank you for the answer!

Indeed, the problem of construction exhaustive patterns is similar to the problem I need to solve.
But I already have exhaustive patterns and I need to construct equivalent disjoint patterns. In this case, it will be possible to process the patterns independently.

Ok you want to collapse

Node(Node(_, _), Leaf) -> 1
Node(Leaf, Node(_,_)) -> 2
Node(Leaf, Leaf) -> 1

in

Node(_, Leaf) -> 1
Node(Leaf, _) -> 2

(dropping Node (Leaf, Leaf) and ending with a minimal number of disjoint pattern). This issue is addressed by usual pattern matching compilation I think. This algorithm is known as “F algorithm” I think (if I remember well ^^). In such algorithm, you have your matrix of patterns, and a column of corresponding actions (numbers 1 to 3 in your example). The result of the algo enable you both to find every patterns, but also to withdraw patterns that are never reached (which correspond to your minimal disjoint pattern number as far as I understand, after collapsing), by identifying actions that are not taken (because covered by previous set of constructors). Although I do not know how to exactly do that, (so I’ll stop posting so that someone really qualified can answer accurately to your post) I think this will be covered in resource that gasche advised to you, and by the way, this is really linked to pattern matching compilation (with both exhaustive check and redundancy detection, that you get for free in the algorithm), at least for the kind of example you provided.
(And if ever you are desperate, in the worst case, to collapse your patterns, there is brute force algorithm where you exhaustively try to identify full sets of variants from a same constructor (here Leaf and Node) in the same position in rows of your pattern matrix that leads to a same action (here, it would collapse

Node(Node(_, _), Leaf) -> 1
Node(Leaf, Leaf) -> 1

in

Node(_, Leaf) -> 1

because both leads to action 1. Then you could also similarly try to generalize patterns for cases covered up by previous rows in your matrix. Here it would collapse

Node(_, Leaf) -> 1
Node(Leaf, Node(_,_)) -> 2

into

Node(_, Leaf) -> 1
Node(Leaf, _) -> 2

because case Node(Leaf, Leaf), when column 2 is reached, has already be covered by case 1.)
Tell me if I misunderstood what you intended but for now I think this is the tool you are searching for.
Good luck for your project !
Best !