Examples of using Base Map

datastructures

#1

Hello, I’m struggling to understand how to use Base’s Maps. I found this bit of documentation which helped me figure out how to define my Map type:

type rule =
  | Substr of string
  | Regex of string

type account_rules =
  rule list Map.M(String).t

But now I’m not sure how I can create a map satisfying this type. I also tried looking at the tests for Map but they only created more question marks :slight_smile:

Could anyone provide some examples of how to use Base Map?


#2

On IRC this has been suggested by some friendly folks but I’m still getting a syntax error:

module String_map = Map.M(String)

String_map.add String_map.empty ~key:"x" ~data:1
(* ^^^^^^^ error here *)

I’m still getting a syntax error as marked above though.


#3

Try this.

let empty = Map.empty (module String)
let numbers = Map.of_alist_exn (module String) ["three", Substr "three"; "four", Substr "four"]

You don’t need to use the functor to instantiate it. The functions in Map are polymorphic over the type of the key and of the data; you just need to pass in the first-class module for the key type.

y


#4

And the syntax error there is because you need to let-bind the result of calling String_map.add. But even if you fix the syntax problem, that expression won’t work, because that module doesn’t contain the add function. The module is just used for passing to functions like Map.add to tell it which key type is being used.

y


#5

What’s the reason you guys made such a radical change for Base?
Are you now going to change your entire codebase to conform to the new interface?


#6

We made the change because it’s more modular: now, you don’t need every module that wants to be a key in a comparison-based datastructure to explicitly refer to that datastructure. You just need comparability to be supported.

We’re not removing the Int.Set and and Int.Map modules, but we are in the process of making Core follow the same standard, so that you can write:

Map.empty (module Int)

using both Base and Core, and in the latter case, get something that has a type that is equivalent to Int.Map.t.

y


#7

Is this possibly a Core-specific thing? The standard library’s Map interface is pretty minimal as-is, and doesn’t require any reference to the specific data structure.


#8

Well, it’s to solve some specific problems not solved by the stdlib’s Map module. Core’s Maps do a few nice things:

  • They’re polymorphic over both the key type and data type (stdlib maps are polymorphic only over the data, and have to be instantiated by functor for each key type.)
  • Even so, they don’t let you do unsound things by mixing together maps (say, in a Map.merge) that use different comparison functions
  • They also let you implicitly construct new maps via derived deserializers (via sexp and bin-io).

To get all these to work, we added some per-module logic (inserted in a standardized way via functor). But in Base, we figured out a new cleaner idiom that allowed us to do this just once per capability (i.e., supporting comparisons) rather than per-data-structure.

y


#9

So you’re storing the comparison function in the data structure, which also has the added benefit of deliberately breaking polymorphic comparison. Neat.

Do you find that first class modules improve the user experience significantly? It certainly should be nice not to have to come up with new module names.

Also, given your decision to go in this direction, what would you say is a good heuristic for choosing first class module vs functors nowadays? I can see a potential problem with inlining, where FCMs are more dynamic than functors and thus not easily inlined by Flambda. The impact is minimal with a single comparison function, but with more functions, this could perhaps be an issue?


#10

This looks interesting,

  • How hard is it to feed your own Int module without using Base.Int?
    If so, does the user need to use janestreet extension to the language

  • It is still sound that if user create two customer Int module, I understand
    if Base.Int makes the comparator abstract, that it would be sound, but on the user
    level, if he/she does not make the comparator abstract, would it still be sound?

Having a cursory look at base code base, I am not sure if I am alone, the codebase
is so hard to read (too many indirections…)


#11

It’s definitely true that the documentation around Map is insufficient. We’re working on it, and I also hope to refresh the chapter in RWO to cover the way Maps now work in Base (and will soon work in Core as well.)

It’s definitely most convenient if you use Base’s Int module, but it’s easy enough to do it with your own module.

module Foo = struct
   module T = struct
     type t = int * int
     let compare x y = Tuple2.compare Int.compare Int.compare
     let sexp_of_t = Tuple2.sexp_of_t Int.sexp_of_t Int.sexp_of_t
   end
   include T
   include Comparable.Make(T)
end

Gives you a module Foo with the appropriate comparator in it, and then this:

let m = Map.empty (module Foo)

lets you create a module keyed by Foo. You need to write a sexp-converter and a comparison function for this to work, because maps both need comparison and the ability to serialize the key for generating useful errors. It’s of course nicer to do this with the appropriate PPXs:

module Foo = struct
    module T = 
       struct  type t = int * int [@@deriving sexp_of, compare]  end
    include T
    include Comparable.Make(T)
end

There’s no soundness issue, because the functor mints a fresh type each time it’s called on a different input module, ensuring that different comparison functions have distinct type witnesses.

y


#12

This is a cool encoding, it may be worth a blog post : )
The only downside is that it is a bit user heavy to create the keyed module. Is it possible to make it as lightweight as just feeding a comparison function while stay
sound?


#13

@bobzhang seeing you’re an Admin consider moving your more in-depth discussion into a thread with an appropriate title. (Discourse allows moving posts to a new thread.) I assume more people are interested in the more in-depth discussion you had with Yaron here but might not see it because it’s “hidden” in this Newbie question thread :slight_smile:

PS. I haven’t yet tried using Map after I got things to work with List.Assoc (perf is not an issue right now) but I’ll get to it eventually. Definitely big thanks for the example @Yaron_Minsky :+1:

FWIW I find it sometimes a bit confusing to trace what the examples in RWO refer to; is it Core, is it Base, what version, etc. I think co-locating this kind of more tutorial-style documentation with the actual API documentation might be an approach worth considering.


#14

Playing around with it a little, any idea why

module M = struct include (val Base.Comparator.make ~compare:(-) ~sexp_of_t:(fun _ -> Sexp.Atom "dummy")) end
let m = Base.Map.empty (module M)

works but

let m = Base.Map.empty (module struct include (val Base.Comparator.make ~compare:(-) ~sexp_of_t:(fun _ -> Sexp.Atom "dummy")) end)

complains of

Error: The type comparator_witness in this module cannot be exported.
Its type contains local dependencies:
%M.comparator_witness

#15

Because there is an existential type and the module must be bound (as in your first case) before we can access its components. You can find an explanation in this article page 7 in the section on Path Elaboration.


#16

Wow, a great answer with an exact reference! Kudos.

I guess first class modules are more counter-intuitive than I realized, since beta reduction doesn’t work as expected.


#17

Modules have never been stable by substitution, first class modules don’t really change that.

For example, in this, you can’t substitute M by its value (try to imagine what would be the type of a).

module M : sig type t val x : t end = struct .... end
let a = M.x

#18

Fascinating. I guess it shouldn’t be surprising that the thing preventing module substitution is precisely that which separates modules from records, that being their contained types (which need static paths).

Getting back to the topic of the new Base Map, here’s how I see it right now:

Pros:

  1. No need to come up with new module names.
  2. No need to construct map modules.
  3. Can’t accidentally use structural comparison.

Cons:

  1. Extra complexity of library code.
  2. Still need to construct key comparison modules (something that could be done with anonymous modules in the functor method).
  3. Further incompatibility with rest of ecosystem.
  4. More dynamic nature (storing the comparison module in the data structure) means Flambda possibly has less opportunity to inline.

#19

This actually reminds me a bit of Chris Okasaki’s approach in https://github.com/chrisokasaki/scads/blob/e78233ac6a787b7c66b44cd6139392418b214eb9/design/heaps.md#problem-1-dont-mix-two-different-orderings-in-the-same-heap (a heap data structure which stores a reference to its ordering module when it first constructs a heap).


#20

It’s worth noting, the type discipline works without actually storing the comparison function in the data structure. See Map.Tree for a version of the data structure with the same type discipline, but no stored up function.

It seems like a mistake to presume what Flambda would do with this kind of code. It’s not at all clear to me what inlining might be possible with these types, though the Flambda devs might have more to say on the matter.

I’m not especially compelled by the compatibility argument. After all, functor-specialized maps aren’t really natural as data-interchange types anyway. (Base’s maps are better for this purpose, since they’re polymorphic.)