Objects use cases in OCaml

@Drup suggested that the following is possibly inferior or more complex than using objects:

module type S = sig
  type t
  val do_thing : t -> t -> t
  val to_string : t -> string
  val other_function : ....
end

type 'a state = ('a * (module S with type t = 'a))

I see his point but as an alternative, I was thinking of doing something like this:


(* A basic, typical module definition. *)
module type S = sig
  type t
  val get : t -> unit -> bool
end

(* An implementation of [S] which simply returns its associated value. *)
module Id : S with type t = bool = struct
  type t = bool
  let get t () = t
end

(* In spirit of [S], but with [type t] stripped out of the signature. *)
module type S' = sig val get : unit -> bool end

(* Functor to convert an [S] to [S'], given a factory function for producing values of [type t]. *)
module Make(T : sig include S val create : unit -> t end) : S' =
struct
  let t = T.create ()
  let get = T.get t
end

(** Example instantiations of [S'] from [S]. *)
module Always_true = Make(struct include Id let create () = true end)
module Always_false = Make(struct include Id let create () = false end)

(* Example of a function aggregating over all the module instances. *)
let for_all l = List.for_all (fun (module T:S') -> T.get ()) l

(** Should return [false]. *)
let result = for_all [(module Always_true);(module Always_false)] 
2 Likes

Late to the game here, but I would like to point out that liquidsoap is perhaps the only large-scale OCaml application I know of that relies heavily on objects. Back when it was started, there were no first-class modules. Also, the application relies on an abstract interface for streaming sources that fits the object paradigm very well, especially with self recursion and virtual functions, much like exemplified by ivg.
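To make "self recursion and virtual functions" concrete, here is a minimal sketch. The class names and methods are hypothetical and are not liquidsoap's actual API; the point is only that a shared method can call, through self, a method that each subclass must supply.

```ocaml
(* Hypothetical names; not liquidsoap's real source interface. *)
class virtual source (name : string) =
  object (self)
    (* Virtual method: every concrete source must implement it. *)
    method virtual get_frame : int -> string

    (* Shared behaviour that dispatches through [self] at runtime. *)
    method describe n = Printf.sprintf "%s: %s" name (self#get_frame n)
  end

class silence =
  object
    inherit source "silence"
    method get_frame n = String.make n '.'
  end

let description = (new silence)#describe 3
```

Here `describe` is written once in the base class and picks up whichever `get_frame` the concrete class defines.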

That being said, as a long-time developer in OCaml and in other languages as well, I stay away from the object paradigm as much as I can. Inheritance patterns are confusing and bug-prone, and they always have you going back to the order of operations, or to how and when underlying classes are instantiated and with which parameters. The module paradigm is much cleaner and more readable, in my opinion.

3 Likes

Lots of libs and apps at least use objects: js_of_ocaml, BAP, Gtk bindings.

First-class modules do not replace objects. Besides, they are quite clumsy to use for such things: you have to drag your state and module around together, which feels a lot like GObject.
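A rough side-by-side of what "dragging your state and module" looks like, using a simplified signature (the `step` function and the `Counter` module are made up for illustration). This is only a sketch of the two styles, not a claim about which one is better.

```ocaml
(* Simplified, hypothetical signature for illustration. *)
module type S = sig
  type t
  val step : t -> t
  val to_string : t -> string
end

module Counter = struct
  type t = int
  let step n = n + 1
  let to_string = string_of_int
end

(* First-class module style: the value and its operations travel as a pair. *)
type 'a state = 'a * (module S with type t = 'a)

let step_fcm (type a) ((x, m) : a state) : a state =
  let module M = (val m : S with type t = a) in
  (M.step x, m)

let show_fcm (type a) ((x, m) : a state) : string =
  let module M = (val m : S with type t = a) in
  M.to_string x

let fcm_result =
  show_fcm (step_fcm (0, (module Counter : S with type t = int)))

(* Object style: the state is hidden inside the object. *)
class counter (init : int) =
  object
    val n = init
    method step = {< n = n + 1 >}
    method to_string = string_of_int n
  end

let obj_result = (new counter 0)#step#to_string
```

Both `fcm_result` and `obj_result` compute the same string; the difference is how much plumbing each caller has to write.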

1 Like

I feel like you’ve made a really important observation. Thank you. Even in C++, O-O is used less and less, in favor of template (parametric, typically) polymorphism and various “interesting” Boost constructs for algebraic datatypes. O-O seems to be used to implement the infrastructure of (e.g.) maps, etc, but not by users of those features.

Which is excellent progress, and I write this as an avid (at times) C++ programmer.

1 Like

Someone needs to benchmark these two approaches. I have a feeling that first-class modules would come out ahead here, and this can be far more important than what is syntactically lighter or more conceptually pleasing.

2 Likes

This is definitely true, but it just highlights the need for modular implicits in the language. While FCM have some unique use cases, they’re mostly a poor-man’s version of the dictionaries we want passed implicitly with each type.

As much as I’d like to see modular implicits, I get the impression no one is going to go near that work before multicore is merged.

1 Like

An off-topic newbie question about the student example:

module rec S1 : Student = Base(Younger(S1))
let s1 = S1.(say@@create 20);;
module rec S2 : Student = Base(Younger(S2))
let s2 = S2.(say@@create 20);;
  1. How should I interpret the ‘module rec’ for self?

The only explanation I can find is

  2. Aren’t S1 and S2 the same?

Disclaimer: I’m not sure I’m actually understanding the question, so the answer could be a little bit off.

When you define a recursive module, whether it is a structure or a functor, the module being defined can be used in its own definition. Putting typing aside, this is exactly the same as with functions and other recursive data types. A structure in OCaml has the same runtime representation as a record, i.e., it is just a named tuple. A functor is, therefore, a function that takes a record and creates a new record. When you define a structure recursively, a record is created with all fields uninitialized (e.g., all functions just raise exceptions, basically like in a purely abstract class), and it is our task to ensure that in the end all definitions are well-formed. For example, we can indeed create such bogus definitions as:

module type Show = sig type t val show : t -> string end
module rec I : Show with type t = int = struct include I end

This is basically the same as creating an abstract (virtual, in OCaml parlance) class and inheriting from it without providing any implementation. However, unlike with classes, where such definitions are forbidden and reported as errors at compile time, with modules the definition is silently accepted and the error appears only at runtime:

# I.show 42;;
Exception:
File "//toplevel//", line 1, characters 40-46: Undefined recursive module.
Raised at file "camlinternalMod.ml", line 36, characters 33-65
Called from file "toplevel/toploop.ml", line 180, characters 17-56
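For contrast, here is a small sketch of a recursive module pair in which every field really is defined, so nothing is left pointing at the placeholder. The module names are illustrative only.

```ocaml
(* Each module refers to the other; both fields are proper functions,
   so the recursive definition is well-formed. *)
module rec Even : sig
  val check : int -> bool
end = struct
  let check n = n = 0 || Odd.check (n - 1)
end

and Odd : sig
  val check : int -> bool
end = struct
  let check n = n <> 0 && Even.check (n - 1)
end

let ten_is_even = Even.check 10
let seven_is_even = Even.check 7
```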

It depends on your definition of “the same”. They have the same module type, they have the same definition, and from the point of view of Leibniz equality, they are the same. However, they are different objects and basically, they correspond to two different pieces of code. In other words, they are the same as the following functions f1 and f2,

let f1 x y = x + y
let f2 x y = x + y
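One way to see that two textually identical module definitions are still separate things is to give them some internal state. The functor below is a made-up example, not the student code from the question: each application evaluates its body afresh, so each resulting module gets its own counter.

```ocaml
(* Hypothetical generative functor: every application creates fresh state. *)
module Make_counter () = struct
  let count = ref 0

  let next () =
    incr count;
    !count
end

module C1 = Make_counter ()
module C2 = Make_counter ()

(* Advance C1 twice; C2 is untouched. *)
let () = ignore (C1.next ()); ignore (C1.next ())

let c1_value = C1.next ()
let c2_value = C2.next ()
```

Without mutable state, as in `f1` and `f2`, the two definitions are indistinguishable by their behaviour even though they are separate pieces of code.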
2 Likes

Cryptokit has an object interface:

It annoyed me the first time I used it, because I had to learn some strange syntax that I never use outside of Cryptokit.

Sometimes, I wonder if the O in OCaml was not added by someone from the marketing department of INRIA (if there is any). :smiley:

I don’t know why Xavier [urk and DIDIER (Remy)!] &co did it. But I’m glad they did. Marketing counts for something; maybe it counts for a lot. But also: even a lapsed cleric of the Church of Intensional Type Theory (“extensional? Never touch the stuff! It’s sequents all the way down!”) can appreciate that sometimes, you don’t have the time, or the knowledge, or the patience, to craft the precise types you need to solve a problem. In those moments, the O-O features come in really handy.

Yes, it’s pretty fricken’ rare. I can remember only one instance (back in '01) when I needed 'em. But notwithstanding, it would have been a PITA without objects. And sure/sure/sure, I’m completely willing to agree that we could just use first-class modules (or whatever – I’ve never even used a first-class module); but like GADTs, that’s more stuff I have to learn about OCaml’s type system, to solve my problem.

What I’m trying to say is: it’s excellent to force people to learn a bit about type theory, a bit about Ocaml’s type system, to solve their problems. Force them to learn too much, and you’re where Haskell is (IM(NS)HO).

It’s a good thing that Ocaml can be used by talented systems-jocks who only want to learn just enough type theory to get their job done. Which job is completely unrelated to anything that most PL folks would even recognize.

3 Likes

Oh, I forgot: The Thrift Ocaml API (the API used by the generated marshallers, and the generated marshallers themselves) are heavily, heavily dependent on Ocaml’s objects.

It’s probably true that one could get rid of this dependency with a complete restructuring of the Thrift-Ocaml runtime library, as well as the “emitter”/“protocol compiler” (a big mass of old C++ code … blecccch). But who would want to do it, and why? People have more valuable things they can spend their time on.

And. It. Works. Marvelously, as a matter of fact.

2 Likes

In 2010 I was porting Mark Hayden’s Ensemble group-communications library (ocaml + C) to run on Infiniband. At the bottom was some C code that managed “buffers” using manual reference-counting (the rationale is in chapter 4 of his PhD thesis, and it is compelling). I wanted to “swivel” Ensemble to run on buffers that were managed by the Infiniband card (hence, hardware buffers, hence, for sure a finite resource, not manageable by the Ocaml GC). I found it -infinitely- easier to do this swiveling by introducing an object interface down at the bottom of Ensemble for the “manager of buffers” than trying to figure out how to functorize the whole of Ensemble.

At the end of the day, I wasn’t doing the work merely to get a “port” of Ensemble that worked on Infiniband; I wanted to build a distributed system using Ensemble, and needed it to run on Infiniband for performance reasons. Every hour spent hacking on Ensemble was an hour I wasn’t working on my distributed system.

ETA: Ensemble is a -big- system. Not feasible to move it into a single functor. It wasn’t clear at the time (and probably not clear today) if functorizing all the various bits, and then knitting them together with functor-applications, would have allowed the native-code compiler to do its magic. Since microseconds matter when it comes to Infiniband, it wasn’t a matter of “let’s do the work and see what happens”, but rather “let’s pick a path that guarantees we’ll get the performance we expect”.

By which I mean that: yes, ML’s functors are wonderful stuff. But when we use 'em, we’re counting on the compiler to deliver for us; when it doesn’t, life can suck. Being able to introduce a single O-O interface in a single not commonly-used path is … lovely.
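To give a flavour of what a single object interface at the bottom of a system might look like, here is a sketch. Everything in it (the `buffer_manager` class type, its methods, the heap-backed implementation) is invented for illustration and is not Ensemble's actual code.

```ocaml
(* Hypothetical interface for "the manager of buffers". *)
class type buffer_manager = object
  method alloc : int -> Bytes.t
  method release : Bytes.t -> unit
  method in_use : int
end

(* One possible implementation backed by the ordinary OCaml heap.
   A hardware-backed manager would provide the same methods. *)
let heap_manager () : buffer_manager =
  object
    val mutable live = 0
    method alloc n = (live <- live + 1; Bytes.create n)
    method release (_ : Bytes.t) = live <- live - 1
    method in_use = live
  end

(* Client code depends only on the interface, not on the implementation. *)
let with_buffer (m : buffer_manager) n f =
  let b = m#alloc n in
  Fun.protect ~finally:(fun () -> m#release b) (fun () -> f b)

let manager = heap_manager ()
let len = with_buffer manager 16 Bytes.length
```

Swapping in a different manager then only requires passing a different object, with no functor applications to thread through the rest of the code.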

3 Likes

At Be Sport, we use them as an output type of SQL results instead of tuples in PGOcaml.
It comes in handy when you don’t want to define a type because it is used in only one place and you want a struct-style type.

We basically use them as anonymous records.
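A minimal sketch of the "anonymous record" idea. The object below is written by hand as a stand-in for a query result, and the field names are made up; no type declaration is needed, and functions can accept any object that has the methods they use.

```ocaml
(* Hand-written stand-in for a result row; no type definition required. *)
let row =
  object
    method id = 42
    method name = "alice"
  end

(* Inferred type: < id : int; name : string; .. > -> string,
   so any object with at least these two methods is accepted. *)
let describe r = Printf.sprintf "%d:%s" r#id r#name

let described = describe row
```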

6 Likes

I used the object subsystem for a few things in Orsetto.

  • The Cf_uri module uses it for representing the diamond inheritance hierarchy that includes relative and absolute URI.
  • The Cf_stdtime module uses it for representing the inheritance hierarchy of time values that may be qualified with time zone offsets.
  • The Cf_encode and Cf_decode modules define Cf_encode.emitter and Cf_decode.scanner class types with various private implementations for different kinds of input and output. There is also a Cf_encode.framer subtype of the emitter class type, that provides additional low-level methods.

In each of these cases, I chose to use the object system because a functionally equivalent interface using the module subsystem would have been unwieldy. The object subsystem is nice when you need it.

2 Likes

One interesting use of objects is in the compiler itself: objects are used in the native-code backend to share code across architectures. X. Leroy wrote an interesting set of slides where objects and modules are compared and this particular example is discussed:

https://xavierleroy.org/talks/icfp99.ps.gz

(I have a pdf version here: https://gist.github.com/nojb/da0e294f800d5401b6746ce6dccd05fc)

3 Likes

FYI, the PDF is sideways, which makes it hard to read. (It also turns out to be hard to download a raw github gist!)

You just need to click on the button that reads “Raw” on the upper right corner.

I quite enjoyed the slides. From start to elitist/populist divide.

Here are the slides with landscape orientation

1 Like