I wonder if there has been any performance evaluation, of e.g. Set and Map, that compares the functorial (Stdlib
Make functor style) versus “comparator” (Base-style) interfaces? To be concrete, suppose we define a
Make functor over the tree data structure underlying the Base/Core Map module as below to obtain a type whose values are just trees and the compare function is obtained from the functor argument rather than the individual values.
Is it immediately obvious that this functorial interface should have the same performance as the comparator-style interface exposed by Base by default? I am thinking about e.g. the compiler’s ability to inline compare functions in the two cases. With the comparator interface, IIUC there is only one copy of the compiled code for e.g.
Map, and furthermore the compare function needs to be read out of a record field each time the external interface boundary is crossed. So it seems “unlikely” that the compare functions can be inlined. On the other hand, with the functorial interface, is there at least the chance that the compare function could be inlined?
There is also a question about whether it matters that there is an “extra” indirection every time the external interface is crossed (due to the toplevel record containing the compare closure and tree itself). For instance, naively it seems that implementing
Map.of_alist outside of
Map would need to do perhaps-meaningfully more work than defining it inside
Map. Has this sort of thing been evaluated and understood to be a non-issue?
I’m mostly wondering if anyone has expectations based on understanding of the compiler, or has been learned already from experiments with recent compilers (i.e. flambda).
For other context regarding functorial interfaces, there is definitely a point that favors the comparator-based interface where it can avoid compiled code duplication via a quadratic (number data structures times number of key types) number of functor applications for e.g.
String.Map modules for each key type like
String and data structure like
But, IIUC there are several drawbacks of the comparator-based interface, independent of performance considerations. Comparators are values, in particular records with a field of function type, which means that:
Marshaling values whose representations involve a comparator-based container will require enabling support for closures, which is slower and much more stringent with respect to interoperability. In use cases I commonly see, there are data structures with a lot of implicit sharing where the structure of the sharing is irregular, and so making it explicit in order to use a serialization system that does not support sharing well would be very invasive at the source code level, and a lot of work.
- Comparators are extremely inconvenient when used in recursive modules, since the key module will need a comparator, and thereby violate the recursive module safety check. For example, code like an analogue of the manual’s example of a tree with nodes that have sets of children becomes intricate and requires duplicating the signature numerous times.
Thoughts? Am I the only one who would rather keep the functor-style APIs (in cases where the number of functor applications does not explode problematically)?
Functorial interface to
open Core module Make (Key : sig type t [@@deriving compare, sexp_of] end) : sig type key = Key.t module Key : sig type t = key include Comparator.S with type t := t end include Core_kernel.Map_intf.Make_S_plain_tree(Key).S val compare : ('a -> 'a -> int) -> 'a t -> 'a t -> int end = struct module KeyMap = Core.Map.Make_plain (Key) module Key = KeyMap.Key type key = Key.t include KeyMap.Tree let compare = compare_direct end