Representing statistical distributions with first class modules


#1

I’m trying to find a way of representing an open set of statistical distributions and have been playing around with first-class modules following the approach outlined here.

My module looks like:

module type Dist = sig
  type param
  type repr 
  type t 
  
  val create : repr -> param
  
  val sample : Random.State.t -> param -> t 
  
end

let sampleDist (type a b) rng (module D: Dist with type t = a and type param = b) (param:b) = 
  D.sample rng param

For convenience, I’m then wrapping it in another module that pairs a particular value of param with the first-class Dist module (again, following the linked example), giving me:

module type Dist_instance = sig
    module Dist : Dist
    val this : Dist.param
end

However, I can no longer write a sample function that will work with this Dist_instance module:

let sample (type a) rng (module D : Dist_instance with type D.Dist.t = a) =
        D.Dist.sample rng D.this

How would I expose or reference the return type t of the Dist module within the Dist_instance module?

Many thanks,

Michael


#2

The root issue is that when you define the inner module

module Dist: Dist

you are hiding any type equalities between the types repr, param, and t defined by the module Dist and the outside world. Thus you can never construct any values of type t from a module of type Dist_instance.
To avoid this issue, you need to lift those types into the Dist_instance module and use them to make the signature constraint on the submodule Dist much less opaque:

module type Dist_instance  = sig
   type param
   type repr
   type t
   module Dist: Dist with type param = param and type repr = repr and type t = t
   val this: Dist.param
end
let sample (type sample) rng (module D: Dist_instance with type t = sample) =
  D.Dist.sample rng D.this
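
As a rough sketch of how a concrete distribution could be packed against this signature: the Uniform module and the uniform helper below are hypothetical names chosen for illustration and do not appear in the question. The signatures are repeated so the snippet stands on its own.

```ocaml
module type Dist = sig
  type param
  type repr
  type t
  val create : repr -> param
  val sample : Random.State.t -> param -> t
end

module type Dist_instance = sig
  type param
  type repr
  type t
  module Dist : Dist with type param = param and type repr = repr and type t = t
  val this : Dist.param
end

let sample (type s) rng (module D : Dist_instance with type t = s) : s =
  D.Dist.sample rng D.this

(* Hypothetical distribution: uniform on the interval between lo and hi. *)
module Uniform = struct
  type param = float * float
  type repr = float * float
  type t = float
  let create (lo, hi) = (lo, hi)
  let sample rng (lo, hi) = lo +. Random.State.float rng (hi -. lo)
end

(* Hypothetical helper: pairs Uniform with a parameter value.
   Only the sample type t is exposed; param and repr stay abstract. *)
let uniform lo hi : (module Dist_instance with type t = float) =
  (module struct
    type param = Uniform.param
    type repr = Uniform.repr
    type t = Uniform.t
    module Dist = Uniform
    let this = Uniform.create (lo, hi)
  end)

(* The result has type float because the equation on t is visible. *)
let draw : float = sample (Random.State.make [| 42 |]) (uniform 0.0 1.0)
```

Because only t is constrained in the package type, callers of sample never need to know how a given distribution represents its parameters.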

Note, however, that once you start to have a this or self value inside a packed module, it is a sign that you are starting to reimplement classes and objects. Thus, you could define a distribution as

class type ['sample] distribution = object 
  method sample: Random.State.t -> 'sample
  method expectation: 'sample
  method variance: 'sample
 end
let sample state x = x#sample state
class exp lambda = object(_:'self)
  constraint 'self = float #distribution
  ...
end
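
For a complete example of the same pattern, here is a minimal sketch using a uniform distribution instead. The class name uniform and its lo and hi parameters are assumptions made for illustration, not part of the code above.

```ocaml
class type ['sample] distribution = object
  method sample : Random.State.t -> 'sample
  method expectation : 'sample
  method variance : 'sample
end

let sample state x = x#sample state

(* Hypothetical class: uniform distribution on the interval between lo and hi. *)
class uniform lo hi = object (_ : 'self)
  (* Forces the object to provide every method of float distribution. *)
  constraint 'self = float #distribution
  method sample rng = lo +. Random.State.float rng (hi -. lo)
  method expectation = (lo +. hi) /. 2.0
  method variance = ((hi -. lo) ** 2.0) /. 12.0
end

let u = new uniform 0.0 2.0
let draw = sample (Random.State.make [| 42 |]) u
```

The constraint on 'self means a missing or mistyped method is reported at the class definition rather than at the first use.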

And, if needed, you could define a type for distribution builders:

type ('param,'sample, 'd) gen = 'param -> ('sample #distribution as 'd)

#3

Thank you! I did wonder if classes would make more sense…

The linked section of Real World OCaml seems to suggest that this is a realistic use case of first class modules:

With this signature, we can create a first-class module that encompasses both an instance of the query and the matching operations for working with that query.

Is there a compelling reason to use them (vs. classes) in that instance?

Thanks again,

Michael


#4

I would say it is a matter of preference and context between first-class modules and classes. However, if you are not already using first-class modules, starting with objects rather than emulating them with first-class modules is probably simpler.