In defense of OCaml objects

toots · February 29, 2024, 4:37am

The case for objects in OCaml: duck typed values

Personally, I really dislike the object paradigm in programming and believe it’s bound to disappear.

It’s an over-engineered paradigm that was introduced in a very early stage of abstract programming languages and, more and more, programmers and engineers are learning to go for leaner abstractions that are easier to reason about.

To me, this all boils down to inheritance, which is a bad paradigm and a great footgun. Not being able to easily track down what version of a given method will be executed is really bad for reasoning about your code. At least that’s my opinion.

The object model in OCaml, in that regard, is not the most popular aspect of the language. However, I have learned to love using them for a specific type of use.

The reason being that objects are typed by what they do and not what they are named. This is perhaps one of the few, if not the only part of the OCaml typing system that is like that. Even records are nominatives (see below for a discussion about that).

In type theory, I believe that this is the difference between duck typing and nominative typing.

Quick example

Take an example: let’s say that your program has two implementations of timestamps.

One that is using the Unix API: timestamps are floating point numbers and you use Unix functions to sleep, regular arithmetic to manipulate them etc.
One that is using posix’s timespec.

You can then define a generic type for time intervals:

type t =
  < implementation : string
  ; now : t
  ; sleep_until : unit
  ; of_float : float -> t
  ; to_float : float
  ; add : t -> t
  ; subtract : t -> t
  ; multiply : t -> t
  ; lt : t -> bool
  ; lte : t -> bool >

And have the application pick whatever implementation is available:

val implementations : (string, t) Hashtbl.t

Then, each API can implement its own object:

let unix x : t =
  object (self)
    val x = x
    method implementation = "builtin (low-precision)"
    method now = {<x = Unix.gettimeofday ()>}

    method sleep_until =
      let delay = x -. self#now#to_float in
      if 0. < delay then (
        try Thread.delay delay
        with Unix.Unix_error (Unix.EINTR, _, _) -> self#sleep_until)

    method of_float x = {<x>}
    method to_float = x
    method add x' = {<x = x +. x'#to_float>}
    method subtract x' = {<x = x -. x'#to_float>}
    method multiply x' = {<x = x *. x'#to_float>}
    method lt x' = x < x'#to_float
    method lte x' = x <= x'#to_float
  end

The posix version is sligthly larger and left out for conciseness. However, it simply implements the same object type.

This is very versatile and much more convenient to use dynamically at runtime compared to navigating the complexities of first-class modules, functors and etc.

What about records?

It is also worth noting that records can technically achieve the same type of API. However, records are still nominative so, you need to declare a common interface that all users of the record type have to refer to.

Being able to simply declare “I want an object that I can call with this and that” is also much more flexible and reduces inter-dependencis between your source files.

This is particularly useful to break recursive dependencies. Say that you want to define a type of values that are called source, that each source needs to be attached to a clock and that each clock can return a list of sources attached to them.

Add to this that the implementation for source and clock is pretty big and you want to have them in separate files…

Under these assumptions, it becomes pretty complicated to setup your source code tree to support this. However, with an object type you can simply do:

clock.ml:

type source = < id: string; ... >
type clock = {
  sources: source Queue.t;
  ...
}

let create () = ...
let attach clock source =
  Queue.push clock.sources source

source.ml:

class source id =
  let clock = Clock.create () in
  object(self)
    initializer
      Clock.attach clock (self :> Clock.source)

    method id = id
    method clock = clock
...
  end

You’ve immediately broken up your cyclic dependencies: clocks simply need to declare what they want sources to implement and do not care about how they are named…

n4323 · February 29, 2024, 8:50am

It would be great to see code examples for the record alternative in the first example, in order to more explicitly compare the two approaches.

I have been using records to pass parameters of some long-running simulation loop:

let params = object 
  method speed = 4.  
  method time_step = 0.01
  method total_time = 3. (* etc... *) 
end
let rec loop params x t = 
  if t >= params#total_time then x else
    let xdx = x *. params#speed *. params#time_step in
    loop params (x +. xdx) (t +. params#time_step)

The perceived benefit is that I don’t have to define a record type for the parameters (in this case listing them all as float), which just feels like cumbersome boilerplate. At the same time I fear a slowdown due to slower method access.
An alternative would be do define a one-off module Params. However, if I want to pass that to functions as a first-class module instead of keeping it as a global ‘variable’, then I need to make the signature explicit, which is even more boilerplate.

toots · February 29, 2024, 3:22pm

Good point on the benchmark.

Here’s a quick implementation for both case:

let object_timestamp x =
  object
    val x = x
    method to_float = x
    method multiply x' = {<x = x *. x'#to_float>}
  end

type 'a timestamp = {
  value: 'a;
  multiply: 'a timestamp -> 'a timestamp -> 'a timestamp
}

let record_timestamp value =
  let multiply timestamp v = { timestamp with value = timestamp.value *. v.value } in
  { value; multiply }

Indeed, the benchmarks are pretty radically different:

toots · February 29, 2024, 4:06pm

Looking more into it, records also have issues with keeping a private reference. In order to have an implementation that is not using objects and can abstract away the actual implementation without having issues with types escaping their scope, all I could find was using an extensible type:

type t = ..

module type T = sig
  val implementation : string
  val time : unit -> t
  val sleep_until : t -> unit
  val of_float : float -> t
  val to_float : t -> float
  val ( |+| ) : t -> t -> t
  val ( |-| ) : t -> t -> t
  val ( |*| ) : t -> t -> t
  val ( |<| ) : t -> t -> bool
  val ( |<=| ) : t -> t -> bool
end

type implementation = (module T)

val unix : implementation
val implementations : (string, implementation) Hashtbl.t

With:

type t = ..

module type T = sig
  val implementation : string
  val time : unit -> t
  val sleep_until : t -> unit
  val of_float : float -> t
  val to_float : t -> float
  val ( |+| ) : t -> t -> t
  val ( |-| ) : t -> t -> t
  val ( |*| ) : t -> t -> t
  val ( |<| ) : t -> t -> bool
  val ( |<=| ) : t -> t -> bool
end

module Unix = struct
  type t += Unix of float

  let unix_time = function Unix f -> f | _ -> assert false
  let implementation = "builtin (low-precision)"
  let time () = Unix (Unix.gettimeofday ())
  let of_float x = Unix x
  let to_float x = unix_time x
  let ( |+| ) x y = Unix (unix_time x +. unix_time y)
  let ( |-| ) x y = Unix (unix_time x -. unix_time y)
  let ( |*| ) x y = Unix (unix_time x *. unix_time y)
  let ( |<| ) x y = unix_time x < unix_time y
  let ( |<=| ) x y = unix_time x <= unix_time y

  let rec sleep_until t =
    let delay = unix_time t -. Unix.gettimeofday () in
    if 0. < delay then (
      try Thread.delay delay
      with Unix.Unix_error (Unix.EINTR, _, _) -> sleep_until t)
end

type implementation = (module T)

let unix : implementation = (module Unix)
let implementations : (string, implementation) Hashtbl.t = Hashtbl.create 2
let () = Hashtbl.add implementations "ocaml" unix

and:

open Posix_time2
open Posix_time2.Timespec

module Sys_time = struct
  type Liq_time.t += Posix of Timespec.t

  let posix_time = function Posix t -> t | _ -> assert false
  let implementation = "native (high-precision)"
  let time () = Posix (clock_gettime `Monotonic)

  let of_float d =
    let tv_sec = Int64.of_float d in
    let tv_nsec = Int64.of_float ((d -. floor d) *. 1_000_000_000.) in
    Posix (Timespec.create tv_sec tv_nsec)

  let to_float t =
    let { tv_sec; tv_nsec } = posix_time t in
    Int64.to_float tv_sec +. (Int64.to_float tv_nsec /. 1_000_000_000.)

  let normalize ~tv_sec ~tv_nsec =
    let tv_sec = Int64.add tv_sec (Int64.div tv_nsec 1_000_000_000L) in
    let tv_nsec = Int64.rem tv_nsec 1_000_000_000L in
    Posix (Timespec.create tv_sec tv_nsec)

  let apply fn x y =
    let x = posix_time x in
    let y = posix_time y in
    normalize ~tv_sec:(fn x.tv_sec y.tv_sec) ~tv_nsec:(fn x.tv_nsec y.tv_nsec)

  let ( |+| ) = apply Int64.add
  let ( |-| ) = apply Int64.sub

  let ( |*| ) x y =
    let x = posix_time x in
    let y = posix_time y in
    normalize
      ~tv_sec:(Int64.mul x.tv_sec y.tv_sec)
      ~tv_nsec:
        (Int64.add
           (Int64.add
              (Int64.mul x.tv_sec y.tv_nsec)
              (Int64.mul x.tv_nsec y.tv_sec))
           (Int64.div (Int64.mul x.tv_nsec y.tv_nsec) 1_000_000_000L))

  let ( |<| ) x y =
    let x = posix_time x in
    let y = posix_time y in
    if Int64.equal x.tv_sec y.tv_sec then x.tv_nsec < y.tv_nsec
    else x.tv_sec < y.tv_sec

  let ( |<=| ) x y =
    let x = posix_time x in
    let y = posix_time y in
    if Int64.equal x.tv_sec y.tv_sec then x.tv_nsec <= y.tv_nsec
    else x.tv_sec <= y.tv_sec

  let rec sleep_until t =
    if t |<=| time () then ()
    else (
      try
        clock_nanosleep
          ~clock:(if Sys.os_type = "Unix" then `Monotonic else `Realtime)
          ~absolute:true (posix_time t)
      with
        | Unix.Unix_error (Unix.EINTR, _, _) -> sleep_until t
        | Unix.Unix_error (Unix.EINVAL, _, _) -> ())
end

let posix_time : (module Liq_time.T) = (module Sys_time)
let () = Hashtbl.add Liq_time.implementations "posix" posix_time

This can then be used the same way that objects were being used:

        let time_implementation = time_implementation () in
        let module Time = (val time_implementation : Liq_time.T) in
        let max_latency = Time.of_float conf_max_latency#get in
        let log_delay = Time.of_float conf_log_delay#get in

mbarbin · February 29, 2024, 6:51pm

The open recursion scheme used by ppx_traverse I would say is also a great case for using objects in OCaml.

With very few judicious inherit statements and method overrides, you can express powerful AST transformers. This approach can lead to concise and readable code, making complex transformations easier to implement and understand.

For the sake of archeologic interest, I’d like to acknowledge some ancient technology by the Mighty Louis, the Sage Valentin, and their contributors. They worked on an ancestor of this kind of thing, written in plain ML:

traverseInterface.ml

github.com

MLstate/opalang/blob/424b369160ce693406cece6ac033d75d85f5df4f/ocamllib/libbase/traverseInterface.ml

(*
    Copyright © 2011 MLstate

    This file is part of Opa.

    Opa is free software: you can redistribute it and/or modify it under the
    terms of the GNU Affero General Public License, version 3, as published by
    the Free Software Foundation.

    Opa is distributed in the hope that it will be useful, but WITHOUT ANY
    WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
    FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for
    more details.

    You should have received a copy of the GNU Affero General Public License
    along with Opa. If not, see <http://www.gnu.org/licenses/>.
*)
(**
   Generic Ast Rewriter Interfaces

This file has been truncated. show original

The comments in this code are quite entertaining. For example

  (**
     The type of your trees, e.g. AST (expr). With up to 3 type variables for parametric trees.
     If you need more than 3 parameters, you can either go to hell, or patch this module (choose
     the less painful)
  *)
  type 'p t constraint 'p = _ * _ * _

And this one

(**
   Traverse AB is a support for an ast defined with 2 mutual recursive types
   {[
   type 'a tA = phi('a tA, 'a tB)
   and 'b tB = xhi('a tA, 'a tB)
   ]}
   The rest is less documented, since it is more or less like the previous TRAVERSE,
   and mostly because it is used only by maniac geeks.
*)

However, I’m willing to bet that you can’t beat the object version (ppx_traverse) here.

Topic		Replies	Views
Objects use cases in OCaml Learning oop , objects	38	8588	June 19, 2019
My Thoughts on OCaml vs Haskell/Rust in 2023 Ecosystem blog	91	25351	September 13, 2023
Philosophy and foundation of ocaml Learning	7	1475	July 15, 2019
OCaml objects under the hood Learning	4	1178	June 12, 2018
Blog post: Writing a Game Boy emulator in OCaml Learning blog	12	3043	January 21, 2022

In defense of OCaml objects

The case for objects in OCaml: duck typed values

Quick example

What about records?

Related topics