Next priority for OCaml?

Haha. Yeah, I feel that way especially right after I post something and re-read it. Because then it’s no longer a secret.

Yes, that’s true. But if you change the type of A, a compile error would be helpful, and one is more likely to occur if you’re not relying on a typeclass/trait that is used so pervasively. I think it’s a matter of degree, not all-or-nothing.

What they’re saying here,

It is useful for the sake of future maintenance to stick with monomorphic functions where possible. (You can make a function monomorphic with explicit type annotations or type applications.)

is that you have a choice when you write the original code (before refactoring). You can call a polymorphic length method or a monomorphic one (at least in Rust and Haskell, you have that option). If you don’t have a specific need for polymorphism, then why not use the monomorphic length, so the compiler is more likely to help you later during refactoring?
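
For example, in Haskell the choice might look like this (a minimal sketch; ClientId and the function names are just illustrative):

import qualified Data.Foldable as F

type ClientId = Int

-- Polymorphic: accepts any Foldable, so it keeps compiling (and can
-- silently change meaning) if the argument's type later changes.
countAny :: F.Foldable t => t a -> Int
countAny = F.length

-- Monomorphic: the annotation pins the argument to a list, so passing
-- a value whose type has changed is a compile error at the call site.
countClients :: [ClientId] -> Int
countClients = length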

I do think the example is contrived in the Rust case, however, because I would never have used .iter().count() to get the length of a Vec. Iterators are normally only used to calculate summary info when it isn’t immediately available via some other function.

The example showing the use of the polymorphic Haskell length function is more realistic. I would definitely have used the polymorphic length in that case.

Very true. It may just be Haskell’s typeclasses that are a little overboard. Rust’s traits do seem very well designed, and I know a huge amount of effort was put into that.

Yet this brings up another possible downside to adding ad-hoc polymorphism: the OCaml team and the community would need to put in a very large effort designing the shared typeclasses/traits (or whatever mechanism is used), and also redesigning the standard library and alternate standard libraries to work smoothly and safely with them. Is that realistic? It’s something to at least keep in mind, right?

I don’t follow you. In the example it is the core data structures (Iterator in the case of Rust) that are using polymorphism. The example code (what you’re calling business logic) is just calling their length function.

True, and maybe it isn’t a huge problem. Their assertion is only that “Ad-hoc polymorphism erodes type-safety”; they don’t say how much. How much is pretty hard to say, since it depends on lots of factors. One of them is how carefully the traits were designed and applied, as you say.

Absolutely! There are significant upsides to ad-hoc polymorphism. I posted that article because I think it’s also important to consider downsides.

My real fear is that OCaml would become more complex to read and write. OCaml is the only language I’ve found that has good runtime performance and behavior, but is also very straightforward to use. I hope that OCaml doesn’t go down the ad-hoc polymorphism path without a thorough and vetted design, including how it will be applied in the standard library, as well as to primitive types, etc.

2 Likes

I think you pretty much got it. That’s basically what I said: https://twitter.com/yawaramin/status/1696547916834328861

I think the nuance is that A isn’t an instance of the “has-length” typeclass. A just exposes a function that returns a value of some type t that is an instance of that typeclass. As a consequence, merely changing the return type to some other type u effectively means you’re on the hook for any ad-hoc polymorphic function defined for both t and u, with no help from the compiler.
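
A minimal Haskell sketch of that failure mode (the type and field names are invented, following the blog’s scenario):

type ClientId = Int

data Account = Account { clientIds :: [ClientId] }

-- Calls the polymorphic 'length' from Foldable on the field.
activeClients :: Account -> Int
activeClients a = length (clientIds a)

-- If the field is later changed to
--   clientIds :: Maybe [ClientId]
-- activeClients still type-checks, because Maybe is also Foldable, but
-- 'length' now returns 0 or 1 (Nothing / Just), not the number of clients.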

I didn’t find this obvious. I don’t agree with everything in that blog post, but I learned something from it, and I don’t think it’s right to act like everyone here should understand these issues already.

2 Likes

Sure, I can understand that many people don’t find this obvious. But the fact that the Google C++ Style Guide forbade almost all business logic from creating (as opposed to using) this sort of ad-hoc polymorphism speaks volumes about the dangers of the stuff. And heck, I’m a rabid advocate of traits/modular implicits.

You are undoubtedly right. And this is true of Rust, too: often when I was writing Rust, I had to dig into trait implementations to figure out what the hell was going on with some type error.

Here’s the thing, though: it is crystal-clear to me that without modular implicits, OCaml will not get a high-quality matrix-math tower. B/c there’s just too damn much module-this-and-type-that going on to make the code even remotely readable, unless you use ad-hoc polymorphism. I spent time extending the Rust sparse matrix library and it became very, very clear to me that if we want the conciseness of Python’s numpy/scipy, we’re going to have to go with modular implicits.

Another thing: sure, in the example, because the change caused length to no longer apply, you got a type error. But … I mean, so what? People change the types of members of a struct all the time, and this causes breakage. This isn’t news. In this case, the programmer didn’t construct a robust abstraction[1] around his data-type, so why should he expect that if he changes the data-type, all users will be unaffected?

[1] And that wouldn’t have been hard to do. This “ClientIdList” could have been a type, with a “size” or “length” method, and then, haha, when its implementation changed from “Vec<…>” to “Option<Vec<…> >”, the author would have had to decide how to implement the size/length method. I mean, I don’t see the issue here: this happens even without ad-hoc polymorphism.
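
In Haskell terms, the wrapper the footnote describes might look like this (a minimal sketch with invented names; the blog’s version is in Rust):

type ClientId = Int

-- A dedicated type with its own counting function, instead of exposing
-- the raw list directly.
newtype ClientIdList = ClientIdList [ClientId]

clientCount :: ClientIdList -> Int
clientCount (ClientIdList ids) = length ids

-- If the representation later becomes Maybe [ClientId], this definition
-- stops compiling, and the author has to decide what the count should
-- mean for the new representation.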

3 Likes

I’m not sure what the compiler is supposed to do if the code type-checks. Switching to a different member of a type class effectively changes the implementation and therefore potentially the semantics of the function involved. I don’t see how that is different from changing the implementation of a monomorphic function - if you change a function “double” to return triple its input, your code will still typecheck and build but its behavior will probably change. I don’t see anything surprising in the examples given in the blogpost.

Personally I’d love to see something like typeclasses in OCaml. (But only if we can stop calling them “modular implicits”, which is a perfectly meaningless term unless you’re already quite familiar with OCaml. It’s an implementation strategy, not a feature!) I think the convenience is worth the risk of misuse. But it is regularly pointed out that this is a very long-term feature, presumably because it will take a long time to make sure it’s done right.

That’s the point: we are not happy that it type-checks.

Except that it is not you who performs the act of “switching to a different member of a type class”; it is the compiler, and it does so silently and perhaps unexpectedly.

Of course you cannot expect the compiler to catch downstream bugs statically when you change the semantics of a function without changing its interface (i.e. its type); but you do expect it to catch changes at the type level at least; that’s a major selling point of static types. Instead, in this typeclass example, you change a type definition and, instead of failing to compile, code that depends on it silently switches to using some other code.

3 Likes

I guess that depends on what you mean by “silently”. The programmer explicitly changes the type (but not the type-class); the compiler then does exactly what it is told to do, which is check the type.

I agree there is an issue to be addressed here (“problem” is too strong IMO), but I don’t think it’s a typechecking issue. Rather it’s a problem inherent in the very notion of a “type class”, which eliminates the constraint of “one implementation per function name+type”. That means our intuition that all functions having the same name and type have the same semantics no longer holds. You can add anything to type-class “has-length”, even stuff to which the concept “length” does not (intuitively) apply, and you can define “length” functions that do not match user intuitions about “length”. So you have the same issues even without “refactoring”: code that typechecks but behaves in unexpected ways.
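
To illustrate with a made-up Haskell class (none of this is from the blog):

class HasLength a where
  len :: a -> Int

instance HasLength [b] where
  len = length

-- Nothing stops an instance for a type where "length" has no intuitive
-- meaning, and it type-checks just fine:
instance HasLength Bool where
  len _ = 1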

In the blog’s example, a type was switched to “Maybe” (or “Option”). One could argue that the bug is that those should not be members of the “has_length” typeclass.

The blog title is “Ad-hoc polymorphism erodes type-safety”. A more accurate title might be “Type-classes sever the one-to-one relation of types to semantics.”

1 Like

Dunno about that, but it would definitely become more difficult to reason about OCaml code. You could no longer assume that each function (name) has exactly one meaning. But that could probably be addressed by smart tools.

1 Like

Yes, in general that could be the conclusion, which brings up the point that the typeclasses must be designed very carefully.

But in the Haskell example they are members of Foldable, which probably makes sense for Haskell, and length is a function in that typeclass.
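
Concretely, both the list and Maybe instances of Foldable ship with base, so length applies to both, just with different meanings:

ghci> length [1, 2, 3]                  -- list instance: counts the elements
3
ghci> length (Just [1, 2, 3])           -- Maybe instance: 0 for Nothing, 1 for Just
1
ghci> length (Nothing :: Maybe [Int])
0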

In the Rust example, as I mentioned, they are using .iter().count() to get the length of a Vec, and it is the Iterator trait that is in play. Iterators would not normally be used for this in Rust, so I think this is a contrived example. There is nothing wrong with the design of Iterator in this case; it was just misused.

1 Like

Namespace integration. Lack of a standardized mapping from module paths to file system paths is a huge problem for build programs.

2 Likes

Ooo, yeah. I begin to see why they say “modular implicits” is a (very) long-term project.

1 Like

I am very skeptical about that argument. “Bugs do not exist; computers do what they are told to.”

I get your point about assigning meaning to names, though, and I do agree that having a “length” instance for options is… weird? Even so, I can conceive that a more involved refactoring might end up with a head type that has a meaningful notion of length, e.g. you refactor your client list to a client list list for some reason.

I haven’t programmed extensively in either Rust or Haskell, but I have done quite a bit of Coq. In Coq, typeclasses are both a blessing and a curse.

  • Blessing: they significantly ease writing (and reading!) even simple expressions, which matters a lot more for Coq than for OCaml because of dependent types. They also back the automation infrastructure, which is an additional motivation for Coq to have typeclasses, by contrast with OCaml.
  • Curse: sometimes they do unexpected things. Often there are several possible paths to building an instance, and they do not reduce as you would like them to; you have to be very meticulous and principled when designing a set of instances. Or there is no path because you’re missing an instance of C, and you have a hard time identifying what C even is, because the library you’re using has layers of very abstract notions that are remote from your concrete object. When typeclasses go wrong, debugging them is annoying, requires specific skill and, more importantly, relies heavily on the interactive nature of Coq: you can easily stop at a specific point, go back and forth, inspect the current state of your program, show types, show implicit/inserted stuff, show available typeclass instances… (in other words, Coq is its own debugger on steroids). Also, before debugging, you must detect that there is a bug. On that front, since Coq is a proof assistant, you will end up writing proofs which will catch the bug, without having to run your code and only notice the bug some time later (e.g. in production), if you’re lucky. You don’t have that facility in OCaml.

Even so, I’m very happy that Coq has typeclasses, and I would be happy for OCaml to get some kind of them too. It’s worth the cost IMHO.

Since we’re talking about OCaml and its lack of ad-hoc polymorphism, which proposals like “modular implicits” are meant to address, in the context of this recent article that talks about “type safety” issues (and I have some professional expertise in what we mean by that nebulous phrase in the C++ community, as a member of a committee at my day job that deals with the AUTOSAR standard), I’d like to offer my unpopular opinion here that the problem these discussions are dancing around is basically the Liskov substitution principle.

The whole reason this isn’t as big a problem in Rust as it can be in Haskell is that Rust doesn’t have any subtyping relations (except for its lifetime kind) and Haskell does. If OCaml were to be expanded with modular implicits, all the trouble associated with subtype relations that shows up in Haskell would be replicated in OCaml, along with the additional complication that you’re not going to get canonical coherence in OCaml like you have in Haskell, because it’s incompatible with the OCaml module language. The best we’re going to get is the kind of coherence you see in Scala, and sadly, the essay we are talking about here did not discuss how these examples would look in Scala.

I think there is a connection here to the other basic confusion that tends to arise, especially in OCaml programs that use the object language: the problem that subclasses may or may not be subtypes, and it’s really easy to make a subclass that type-checks as a subtype but cannot actually be substituted for an instance of its supertype. That’s the issue this article is surfacing in Haskell, and the fact that the only way to discuss that issue in either Rust or Haskell is to show it with traits and type classes is, I believe, leading to some distractions here.

2 Likes

I’m not following; OCaml has plenty of subtyping with objects and polymorphic variants. I thought the one core issue really was coherence of instances?

Translation please? Presumably “coherence” is a term of art?

Coherence is when you can statically determine a single, unique instance per type class and type.
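
In Haskell, for example, that property is enforced by rejecting a second instance for the same type whenever both are in scope (a minimal sketch with invented names):

class Size a where
  size :: a -> Int

instance Size [a] where
  size = length

-- A second instance for the same type, e.g.
--   instance Size [a] where size xs = length xs + 1
-- is reported as "Duplicate instance declarations", so there is always
-- a single instance the compiler can pick for [a].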

1 Like

And if you don’t have coherence then type inference can fail, requiring type annotations – correct?

If you don’t have coherence then ideally the typechecker should fail the build telling you that it can’t identify a unique instance, and list the instances it found so you can debug.

EDIT: I will add that type classes/modular implicits don’t really make sense in idiomatic OCaml (at least to me). E.g. here’s one of the motivating examples in the paper:

module type Show = sig
  type t
  val show : t -> string
end

let show {S : Show} x = S.show x

implicit module Show_int = struct
  type t = int
  let show x = string_of_int x
end

But…we don’t program like this in OCaml now. Instead, we pack all the members needed into the original module that contains the type. Or we create a new expanded module that includes the original module and expands on it. The point is we pack in all the members needed for a module to conform to all the interfaces we need it to. E.g.,

module Int = struct
  type t = int

  let show = string_of_int
  (* ...other members... *)
end

Now Int automatically conforms to Show, and with its other members, to any of a bunch of other interfaces. To me this is one of the basic principles of modular programming in OCaml, and trying to move towards a more Haskell style where we split up and parcel out a single module’s functionality into many small modules to make them ‘instances’…seems like an unnatural style. OCaml modules already support subtyping and conformance. Haskell type class instances don’t work in that way. The design spaces are different and imho that’s OK.

7 Likes