Sure, I can understand that many people don’t find this obvious. But the fact that the Google C++ Style Guide forbade almost all business logic from creating (as opposed to using) this sort of ad-hoc polymorphism speaks volumes about the dangers of the stuff. And heck, I’m a rabid advocate of traits/modular implicits.
You are undoubtedly right. And this is true of Rust, too: often when I was writing Rust, I had to dig into trait implementations to figure out what the hell was going on with some type error.
Here’s the thing, though: it is crystal-clear to me that without modular implicits, OCaml will not get a high-quality matrix-math tower. B/c there’s just too damn much module-this-and-type-that going on to make code even remotely readable, unless you use ad-hoc polymorphism. I spent time extending the Rust sparse matrix library, and it became very, very clear to me that if we want the conciseness of Python’s numpy/scipy, we’re going to have to go with modular implicits.
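To make the contrast concrete, here is a minimal sketch (the Ring signature and the dot-product function are invented for illustration; the implicit-argument syntax is the one proposed in the modular implicits paper, not current OCaml):

module type Ring = sig
  type t
  val zero : t
  val add : t -> t -> t
  val mul : t -> t -> t
end

(* Today: the module has to be threaded through by hand at every call site. *)
let dot (type a) (module R : Ring with type t = a) xs ys =
  List.fold_left2 (fun acc x y -> R.add acc (R.mul x y)) R.zero xs ys

(* With modular implicits (proposed syntax), the module would be inferred:
   let dot {R : Ring} xs ys =
     List.fold_left2 (fun acc x y -> R.add acc (R.mul x y)) R.zero xs ys *)

Multiply that “(module R : …)” noise by every operation in a matrix library and the readability problem becomes obvious.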
Another thing: sure, in the example, because the change caused length to no longer apply, you got a type error. But … I mean, so what? People change the types of members of a struct all the time, and this causes breakage. This isn’t news. In this case, the programmer didn’t construct a robust abstraction[1] around his data-type, so why should he expect that if he changes the data-type, all users will be unaffected?
[1] And that wouldn’t have been hard to do. This “ClientIdList” could have been a type, with a “size” or “length” method, and then, haha, when its implementation changed from “Vec<…>” to “Option<Vec<…>>”, the author would have had to decide how to implement the size/length method. I mean, I don’t see the issue here: this happens even without ad-hoc polymorphism.
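A minimal OCaml sketch of the abstraction being described (module and member names invented for illustration):

module ClientIdList : sig
  type t
  val of_ids : int list -> t
  val length : t -> int
end = struct
  type t = int list
  let of_ids ids = ids
  let length = List.length
end
(* If the representation later becomes 'int list option', this module stops
   compiling until the author decides what 'length' means for None --
   exactly the forced decision described above. *)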
I’m not sure what the compiler is supposed to do if the code type-checks. Switching to a different member of a type class effectively changes the implementation and therefore potentially the semantics of the function involved. I don’t see how that is different from changing the implementation of a monomorphic function - if you change a function “double” to return triple its input, your code will still typecheck and build but its behavior will probably change. I don’t see anything surprising in the examples given in the blogpost.
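For instance (a trivial sketch in OCaml):

(* Before: *)
let double x = 2 * x
(* After the change -- same name, same type (int -> int): *)
let double x = 3 * x
(* Every caller still typechecks; the semantic change is invisible to the compiler. *)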
Personally I’d love to see something like typeclasses in OCaml. (But only if we can stop calling them “modular implicits”, which is a perfectly meaningless term unless you’re already quite familiar with OCaml. It’s an implementation strategy, not a feature!) I think the convenience is worth the risk of misuse. But it is regularly pointed out that this is a very long-term feature, presumably because it will take a long time to make sure it’s done right.
That’s the point: we are not happy that it type-checks.
Except that it is not you who performs the act of “switching to a different member of a type class”, it is the compiler, and it does it silently and perhaps unexpectedly.
Of course you cannot expect the compiler to catch downstream bugs statically when you change the semantics of a function without changing its interface (i.e. type); but you do expect it to catch changes at the type level at least; that’s a major selling point of static types. Instead, in this typeclass example, you change a type definition and, instead of failing to compile, code that depends on it silently switches to using some other code.
I guess that depends on what you mean by “silently”. The programmer explicitly changes the type (but not the type-class); the compiler then does exactly what it is told to do, which is check the type.
I agree there is an issue to be addressed here (“problem” is too strong IMO), but I don’t think it’s a typechecking issue. Rather it’s a problem inherent in the very notion of a “type class”, which eliminates the constraint of “one implementation per function name+type”. That breaks our intuition that all functions having the same name and type have the same semantics. You can add anything to type-class “has-length”, even stuff to which the concept “length” does not (intuitively) apply, and you can define “length” functions that do not match user intuitions about “length”. So you have the same issues even without “refactoring”: code that typechecks but behaves in unexpected ways.
In the blog’s example, a type was switched to “Maybe” (or “Option”). One could argue that the bug is that those should not be members of the “has_length” typeclass.
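Spelled out in hypothetical modular-implicits terms (the syntax is from the proposal, and the Length signature and instances are invented to mirror the blog’s Haskell example):

module type Length = sig
  type t
  val length : t -> int
end

implicit module Length_list = struct
  type t = int list
  let length = List.length
end

(* The arguably-buggy instance: an option “has a length” of 0 or 1. *)
implicit module Length_option = struct
  type t = int list option
  let length = function None -> 0 | Some _ -> 1
end

let len {L : Length} x = L.length x

(* Change a field from 'int list' to 'int list option' and 'len field' still
   typechecks: resolution silently switches from Length_list to Length_option. *)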
The blog title is “Ad-hoc polymorphism erodes type-safety”. A more accurate title might be “Type-classes sever the one-to-one relation of types to semantics.”
Dunno about that, but it would definitely become more difficult to reason about OCaml code. You could no longer assume that each function (name) has exactly one meaning. But that could probably be addressed by smart tools.
Yes, in general that could be the conclusion, which brings up the point that the typeclasses must be designed very carefully.
But in the Haskell example they are members of Foldable, which probably makes sense for Haskell, and length is a function in that typeclass.
In the Rust example, as I mentioned they are using .iter().count() to get the length of a Vec, and it is the Iterator trait that is in play. Iterators would not normally be used for this in Rust, so I think this is a contrived example. There is nothing wrong with the design of Iterator in this case, it was just misused.
Namespace integration. Lack of a standardized mapping from module paths to file system paths is a huge problem for build programs.
Ooo, yeah. I begin to see why they say “modular implicits” is a (very) long-term project.
I am very skeptical about that argument. “Bugs do not exist; computers do what they are told to.”
I get your point about assigning meaning to names, though, and I do agree that having a “length” instance for options is… weird? Still, I can conceive that a more involved refactoring might end up with a head type having a meaningful notion of length, e.g. you refactor your client list to a client list list for some reason.
I haven’t programmed extensively in Rust nor Haskell, but I have done quite a bit of Coq. In Coq, typeclasses are both a blessing and a curse.
- Blessing: it significantly eases writing (and reading!) even simple expressions, which matters a lot more for Coq than for OCaml because of dependent types. It also backs the automation infrastructure, which is an additional motivation for Coq to have typeclasses, by contrast with OCaml.
- Curse: sometimes it does unexpected things. Often, for example, there are several possible paths to building an instance, and they do not reduce as you would like them to. You have to be very meticulous and principled when designing a set of instances. Or there is no path at all because you’re missing an instance of C, and you have a hard time identifying what C even is, because the library you’re using has layers of very abstract notions that are remote from your concrete object. When typeclasses go wrong, debugging them is annoying, requires specific skill and, more importantly, leans heavily on the interactive nature of Coq: you can easily stop at a specific point, go back and forth, inspect the current state of your program, show types, show implicit/inserted stuff, show available typeclass instances… (in other words, Coq is its own debugger on steroids). Also, before debugging, you must detect that there is a bug. On that front, Coq being a proof assistant, you end up writing proofs which will catch the bug, rather than running your code and only noticing the bug some time later (e.g. in production), if you’re lucky. You don’t have that facility in OCaml.
Still, I’m very happy that Coq has typeclasses, and I would be happy for OCaml to get some kind of them too. It’s worth the cost IMHO.
Since we’re talking about OCaml and its lack of ad-hoc polymorphism, which proposals like “modular implicits” are meant to address, in the context of this recent article about “type safety” issues (and I have some professional expertise in what we mean by that nebulous phrase in the C++ community, as a member of a committee at my day job that deals with the AUTOSAR standard), I’d like to offer my unpopular opinion: the problem these discussions are dancing around is basically the Liskov substitution principle.
The whole reason this isn’t as big a problem in Rust as it can be in Haskell is that Rust doesn’t have any subtyping relations (except for its lifetime kind) and Haskell does. If OCaml were to be expanded with modular implicits, all the trouble associated with subtype relations that shows up in Haskell would be replicated in OCaml, along with the additional complication that you’re not going to get canonical coherence in OCaml like you have in Haskell, because it’s incompatible with the OCaml module language. The best we’re going to get is the kind of coherence you see in Scala, and sadly, the essay we are talking about here did not discuss how these examples would look in Scala.
I think there is a connection here to the other basic confusion that tends to arise, especially in OCaml programs that use the object language: the problem that subclasses may or may not be subtypes, and it’s really easy to make a subclass that type checks as a subtype, but cannot actually be substituted with any other instance of its supertype. That’s the issue this article is surfacing in Haskell, and the fact that the only way to discuss that issue in either Rust or Haskell is to show it with traits and type classes is, I believe, leading to some distractions here.
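Here is a small OCaml object example of that confusion (names invented): a subclass whose type is accepted anywhere the parent’s is, but which silently violates the parent’s contract:

class counter = object
  val mutable n = 0
  method incr = n <- n + 1
  method value = n
end

(* Structurally the same type as counter, so the coercion below typechecks --
   but it silently stops counting at 10. *)
class capped_counter = object
  inherit counter
  method! incr = if n < 10 then n <- n + 1
end

let () =
  let c : counter = (new capped_counter :> counter) in
  for _i = 1 to 12 do c#incr done;
  assert (c#value = 12)  (* fails at runtime: value is 10, not 12 *)

The subclass typechecks as a subtype, yet no caller relying on counter’s behavior can actually substitute it. That’s the same shape of problem as the typeclass examples.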
I’m not following, OCaml has plenty of subtyping with objects and polymorphic variants. I thought the one core issue really was coherence of instances?
Translation please? Presumably “coherence” is a term of art?
Coherence is when you can statically determine a single, unique instance per type class and type.
And if you don’t have coherence then type inference can fail, requiring type annotations – correct?
If you don’t have coherence then ideally the typechecker should fail the build telling you that it can’t identify a unique instance, and list the instances it found so you can debug.
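Sketched in the proposed modular-implicits syntax (hypothetical: the paper does specify that ambiguous resolution is rejected, but the instances here are my invention):

module type Show = sig
  type t
  val show : t -> string
end

implicit module Show_int_dec = struct
  type t = int
  let show = string_of_int
end

implicit module Show_int_hex = struct
  type t = int
  let show n = Printf.sprintf "%x" n
end

let show {S : Show} x = S.show x

(* 'show 255' is ambiguous: both instances match int. The typechecker
   should reject this and list both candidates, not silently pick one. *)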
EDIT: I will add that type classes/modular implicits don’t really make sense in idiomatic OCaml (at least to me). E.g. here’s one of the motivating examples in the paper:
module type Show = sig
  type t
  val show : t -> string
end

let show {S : Show} x = S.show x

implicit module Show_int = struct
  type t = int
  let show x = string_of_int x
end
But…we don’t program like this in OCaml now. Instead, we pack all the members needed into the original module that contains the type. Or we create a new expanded module that includes the original module and expands on it. The point is we pack in all the members needed for a module to conform to all the interfaces we need it to. E.g.,
module Int = struct
  type t = int
  let show = string_of_int
  (* ...other members... *)
end
Now Int automatically conforms to Show, and with its other members, to any of a bunch of other interfaces (see the sketch below). To me this is one of the basic principles of modular programming in OCaml, and trying to move towards a more Haskell style where we split up and parcel out a single module’s functionality into many small modules to make them ‘instances’…seems like an unnatural style. OCaml modules already support subtyping and conformance. Haskell type class instances don’t work in that way. The design spaces are different and imho that’s OK.
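For instance (a small sketch; the Print functor is invented), conformance here is just structural module subtyping:

module Print (S : Show) = struct
  let print x = print_endline (S.show x)
end

(* Int has more members than Show requires, but conforms structurally. *)
module Print_int = Print (Int)

let () = Print_int.print 42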
Mmmm … in Rust you don’t have that property either. The most common example being collect, which converts an iterator into a collection type. You often have to write e1.collect() with an explicit type annotation on the result in order to get a collection of type T [IIRC]. So for instance, you might want a Vec, but you might want something else (a list?). It’s been nearly a year since I hacked Rust, so I don’t remember the precise details, but this isn’t a rare thing.
It’s mentioned in the Book: Traits: Defining Shared Behavior - The Rust Programming Language
This restriction is part of a property called coherence, and more specifically the orphan rule, so named because the parent type is not present. This rule ensures that other people’s code can’t break your code and vice versa. Without the rule, two crates could implement the same trait for the same type, and Rust wouldn’t know which implementation to use.
Example:
$ cargo build
Compiling rc v0.1.0 (/Users/yawarquadiramin/src/rc)
error[E0119]: conflicting implementations of trait `S` for type `T`
--> src/main.rs:12:1
|
7 | impl S for T {
| ------------ first implementation here
...
12 | impl S for T {
| ^^^^^^^^^^^^ conflicting implementation for `T`
For more information about this error, try `rustc --explain E0119`.
error: could not compile `rc` (bin "rc") due to previous error
The problem I was indicating with not having coherence, however, is the other problem: the logic type checks but different instances are visible at different points in the logic and their behaviors are different in ways that lead programmers to assume that the type checker has provided a guarantee that it does not actually provide.
And I think that’s not all that different from the problem that a subclass often type checks as a subtype while it’s easy to write a subclass that cannot actually be substituted for any other instance of the superclass.
Accordingly I don’t think the essay is really describing a problem with ad hoc polymorphism. It’s describing a problem with the Liskov substitution principle in general, one that comes with the territory and you just have to live with it, in any language that allows substitution.
Yes, this means you don’t have coherence. Any sane implementation will fail the compile with a coherence error (‘multiple instances found’ etc.). The modular implicits paper also emphasizes this.