I’m struggling to find a good code pattern to share algorithms code across slightly different types. For example, a library for sets and maps using a similar tree structure:
type set = Leaf | Node of key \* set \* set
type 'a map = Leaf | Node of key \* 'a \* 'a map \* 'a map
(the real example is the patricia-tree library for one, but I also have other use cases in mind, such as a union-find library with multiple variants, which may-or-may-not include a rank or size field depending on union strategy).
Because most functions (find, iter, map…) heavily and recursively pattern match on both types, a lot of logic can be shared, but doing so is non-trivial. Here is a short list of what came to mind:
-
Duplicate the code, which is a non-solution, but is what Stdlib does for
SetandMap. I specifically want to avoid this as shared code is easier to maintain. -
Implement
setsas unit maps directly, which I want to avoid for memory use reason. -
Use meta-programming (i.e. with CPPO or similar tools). This is probably the solution with the best “performance” which avoids duplication, but it is very heavy to use (every pattern matching branch is wrapped in macro calls…). It also breaks the LSP, which does not help with maintainability.
-
Use a view type: instead on matching on the stored type
x, match onview xwhereviewcasts to a common type (idfor maps, cast to aunit mapfor sets), and define all the shared code in a functor that takes the type and view as a parameter. This is the solution we use in patricia-tree. It allows great code sharing without storing the extra unit value for sets. However, it noticeably degrades performance for sets, because of the extra allocations needed. Unfortunately, I’ve been unable to force inlining of theviewwhich may alleviate this problem, because theviewfunction is a functor parameter. -
Use a view matching: essentially the same as above, but instead of casting to a common type, each implementation would define their own pattern matching function with the same interface:
view_match: leaf:(unit -> 'a) -> node:(key -> 'b -> 'b t -> 'b t -> 'a) -> 'a. This avoids the problem of allocating the intermediate type, but is very heavy to use, and makes things like pattern matching on pairs hard.
I’m wondering if anyone can propose alternative ways (or improvements to those I mentioned here, such as how I could force inline the `view` calls).