Modular Implicits

compiler
faq

#41

I dunno, I’ve seen a lot of people praising OCaml’s explicitness and clarity. Typeclasses often make code reading quite tedious. I’d rather like to see something like this

module String_map = Map.Make(String)
String_map.map f map (* of Map.map (module String) f map *)

instead of haskell’s

map f x

and of course it’s better not to use typeclasses in loops as haskell wiki states, so you still have to monomorphise stuff manually.

BTW both OOP classes and typeclasses require virtual functions (for objects and types respectively), is it possible to have an entirely static ad hoc polymorphism, or does it always require dynamic calls? @lpw25, do modular implicits have some dereference overhead?


#42

That’s not true for rust traits unless you use trait objects. I think ocaml functors can be optimised in a similar way as well, am I wrong?


#43

There is this post “Modules Matter Most”, where Robert Harper highlights the importance of the many-to-many relationship between modules and module types.

This is why I’m looking for modular implicits: a true modularity with a good support from the compiler to choose among the various module variants.


#44

So ad hoc polymorphic functions require dynamic dispatch in rust, right?

fn (v: Trait) { v.meth(); }

would require dynamic dispatch?


#45

is it possible to have an entirely static ad hoc polymorphism, or does it always require dynamic calls? @lpw25, do modular implicits have some dereference overhead?

That requires monomorphisation (i.e. a different copy of a polymorphic function for each type it is used at). Rust does that, as does MLton for Standard ML. However, it can’t be done in general for OCaml, a number of features allow new instances to of polymorphic functions to be generated at run time (e.g. polymorphic recursion). It also has downsides in terms of code size and compile time.

However, monomorphisation is really a special case of specialising a function to known arguments. Flambda already does this. For example, given:

let sum l = List.fold_left (fun acc x -> acc + x) 0 l

flambda can create a new copy of List.fold_left specialised to (fun acc x -> acc + x). Whether it does this is based on its own heuristics that try to judge whether the improvements from specialising the function outweigh the cost of the increased code size.

The same should work for implicits. Given:

module type Print = sig type t val print : t -> string end

let print {P : Print} x = P.print x

implicit module Print_int = struct
  type t = int
  let print = int_of_string
end

let print_ints (l : int list) =
  List.map (fun i -> print i) l

then flambda should automatically specialise the print function to the (implicit) argument Print_int and the whole thing should reduce down to just calling int_of_string.


#46

It’s certainly possible that I’m being unreasonable. I’m new to the community, I don’t understand the problems nearly as well as those on the inside, and I might be letting my enthusiasm for having a feature I’d really like override reality.

I don’t doubt that labor is scarce. It’s always the thing in shortest supply. Wants are infinite but there are only so many smart people to go around. :slightly_frowning_face:


#47

Yes, in that case you are using a trait object, and that uses dynamic dispatch. You can avoid the dynamic dispatch if you do

fn <T: Trait> (v:T) { v.meth(); }

but of course it depends on what you need to do (e.g. if you need to map that function over an iterable and you are dealing with Vec<T> then it’s fine, but if you have a Vec<Box<Trait>> you need dynamic dispatch).

If somebody needs a bit more context, there is a brief introduction here: https://doc.rust-lang.org/book/second-edition/ch17-02-trait-objects.html


#48

I think you’re missing the point here, and why the haskell way is ultimately far more powerful. Ideally, you want your code to be written against interfaces rather than specific implementations. This allows you to reuse your code as much as possible rather than duplicating it, which is one of the basic principles of software engineering (when you correct a bug, it applies everywhere, etc). Haskell’s use of map rather than String_map.map means that map is calling into the Mappable interface. Later on, when they switch to a different data structure, that piece of code will still be relevant, while the OCaml version will need to be rewritten. The beauty of type classes is that this ‘writing against an interface’ is seamless and free. Every time you call a function in a type class interface rather than a concrete function, you get the ability to switch the implementation to some other one. In OCaml, you’d need to functorize your code explicitly to get this kind of behavior, and each instance would need a functor application. It’s just not practical to scale this up – are you going to parameterize every function with multiple functors? It’s just too heavyweight. This is what we want with modular implicits.


#49

Leo, we need to clone you. I hope Jane Street is at least freeing you from dealing with trivialities, and that they realize any minute of yours they take up causes massive damage to the future of OCaml.


#50

From the outside the state of affairs looks like this: There is a clear vision for the path forward for the OCaml ecosystem: Multicore, algebraic effects, flambda improvements, modular implicits. At the same time these are man-years away. This sounds like an ideal moment for industrial or educational parties (INRIA, facebook, tezos, whoever) with an interest in the future of the ecosystem to invest in financing one or a few additional long-term developers who work on development of the core language exclusively.


#51

Multicore itself is not that far away. Actually what you’re seeing is fairly predictable: the theory stuff takes very long because there are very few people who can do it, whereas engineering is a matter of coming up with a design and then ramping up the number of developers. The fact that we had an ostensibly functional modular implicits branch was misleading – it really was just a demo, but it got everyone salivating for something that’s now years and years away.

I do wish our theorists were more into open communication. Type theory isn’t rocket science, but you wouldn’t know it from the way the lid is kept on these things. More blog posts pointing to links to relevant research could lead the way to expanding the expert base to include collaborators who could work on different pieces of the puzzle. The things Leo is working on now could easily tempt haskell theorists to cross over: with modular implicits + typed algebraic effects, OCaml becomes a more powerful, cleaner version of haskell. Heck, the task of properly documenting the type checker alone would allow far more experts to feel comfortable diving in.


#52

Can I ask those of us who are not language design or type theory experts to please refrain from making presumptions about how hard it is, how long it ought to take, what a productive and effective mode of working and communicating is, etc. Writing the sort of “pop science” blog posts to broadcast what is currently going on is exactly the sort of distracting time sink that was mentioned a few posts back as something to hope the experts don’t have to do. I could be misreading things, but my perception is that there is serious under-appreciation of the degree of difficulty, background knowledge, and concerted singular effort that is inherently required by these projects.


#53

I’m salivating at the thought. Unfortunately it sounds like there’s a shortage of skilled minds to work on this, and that even a much larger budget wouldn’t fix that.


#54

If that’s really the problem, then organizing the industrial community to cough up the money isn’t very hard. (It might seem hard but I’ve managed that sort of thing in the past and it’s more tiring than difficult.)

However, I’m getting the impression this is not considered the problem, or at least it is not the limiting problem. Rather, the bigger problem seems to be the shortage of top rate minds to apply to the task. (If I’m wrong on that, someone please correct me, because if the problem is just money I can help organizing that part. Unfortunately I’m too old and stupid to help with what seems to be the actual hard part. which is helping with the design.)


#55

Money always helps IMO. Just adding more dedicated engineers means more features get implemented, which leads to community growth, which leads to more companies using the language and having vested interest in its success, more university departments choosing to teach it, and more researchers becoming interested in doing research in it. In programming languages, while ecosystems aren’t everything, they drive everything else.


#56

Taking about documentation, here are my $0.02

I always wish some people can start documenting typechecker. After I talked to some maintainers and major contributors, I came to conclusion that this is not going to happen. Indeed typechecker implementation is really complicated (parmatch.ml for example, but I agree with you it’s not rocket science), I think it’s also true that some parts of the typechecker are “bad” code. As maintainers keep evolving and fixing typecheckers in the past 20 years without major clean-up (as far as I know), it is reasonable its code base becomes messy. If I remember correctly, I saw a maintainer in Mantis said “we really need to clean up env.ml”. Drup started making some contributions toward a cleaner typechecker and I think this is a good direction.

That said, however, if one starts with small tasks and read necessary papers, I believe he can still have an ok or good understanding of the typechecker.


#57

What @objmagic said is mostly correct (although I think lot’s of people here severely underestimate the complexity of a production-grade typechecker). There are HACKING files that gives introductory instructions and reading materials.


#58

Does the typechecker have an exhaustive test suite? I’ve found in the past that the time needed to write one often is paid off handsomely in future ease of refactoring. You get a lot of confidence making bold changes if you can thoroughly check that they haven’t broken something subtle.

Although it might be hard for amateurs to help refactoring the typechecker itself, it might be possible for them to help with tests.

(Someday, sadly I suspect after I’m too senile to program any more, I hope that the tools are good enough to conveniently verify code like this so that such crude measures as tests are less necessary. :slight_smile: )


#59

Wow @Drup – those hacking files are great! They definitely serve as excellent first pointers into the code.

I think the typechecker code suffers from a combination of factors:

  1. Most old OCaml code isn’t well documented. This is pretty consistent from my experience, and may stem from the fact that with .mli files, people felt comments belonged almost entirely there rather than in the .ml files.
  2. Old OCaml design patterns were… suboptimal. It was seen as fine to heavily use exceptions and global state. This makes the code often as hard to read as code in other languages with heavy global mutable state usage, except you also lack type annotations.
  3. The stdlib wasn’t as developed when the code was written, leading to duplication of similar code patterns over and over.
  4. Pure functional data structures start to break down once you make them more complex. Having had experience writing a compiler with inference and a complex data structure in haskell, it’s a bit of a nightmare (surprising, but true). Adding mutable references for annotations and types makes the data structure much easier to deal with, but you’re then out of the realm of maps and folds, and in the world of iters and mutable state, which is often harder to reason about.
  5. The typechecker has to not only be correct, it also has to be fast, and this means that there are often hacks doing non-obvious things for the sake of efficiency.
  6. Though the algorithm isn’t conceptually complex, there are many subtleties, and without heavy commenting (which doesn’t exist, see above), a user trying to understand the typechecker can easily get lost.

#60

Could we keep this topic a more focused on Modular Implicits? Feel free to open another topic if you wish to discuss compiler or type-checker hacking in general.