I would like to create random floats following a Gaussian (normal) distribution with μ=0 and σ=1.
I’m using the Box-Muller transform, and this is my source code:
let rec gaussian_floats : unit -> float Seq.t =
let pi_2 = 2.0 *. Float.pi in
fun () ->
let rng = Random.split () in
let rec next () =
let a = sqrt (-2.0 *. log (Random.State.float rng 1.)) in
let b = Random.State.float rng pi_2 in
Seq.Cons (a *. cos b, fun () -> Seq.Cons (a *. sin b, next))
in
next
let () =
gaussian_floats () |> Seq.take 10
|> Seq.map string_of_float |> Seq.iter print_endline
Following the specifications of Random.float this program could create infinite floats, because:
Random.float boundreturns a random floating-point number between 0 andbound(inclusive).
and
# sqrt (-2.0 *. log 0.);;
- : float = infinity
This is something I don’t want to happen (even if the probability is low).
I could do my own if-clause to handle that case, but it’s an extra branch. If the bounds were excluding the upper limit, I could do 1.0 - Random.float 1. Another alternative that even works when both bounds are inclusive is: min_float +. Random.float 1. (because min_float +. 1. = 1.).
But these are extra calculations I would want to avoid (even if they don’t hurt a lot).
Looking into the stdlib, I see that there is already an if-clause to exclude results of zero (assuming the upper bound is not extremely small):
Putting the same if-clause in my own code seems to be redundant, but the specification in the manual forces me to do that.
Also, the upper limit seems to be exclusive, following the current implementation. For any finite positive number x, I would expect Random.float x < x:
# Int64.to_float (Int64.shift_right_logical (-1L) 11)
*. 0x1.p-53 *. max_float < max_float;;
- : bool = true
But this isn’t guaranteed by the specification either.
What should I do? I feel like this is a problem other people may deal with as well.
- Should the if-clause be removed from the stdlib (because it’s unnecessary)?
- Could the manual be refined to specify a tighter bound when the upper bound is non-zero and greater than the smallest subnormal number?
Consider:
# 1.0 *. 0x1.p-53 *. (0x1.p-1022);;
- : float = 0.
# 1.0 *. 0x1.p-53 *. (0x1.p-1022 +. 0x1.p-1074);;
- : float = 4.94065645841e-324
Not sure if depending on rounding modes, the limit may be slightly different, but in either case it is so low (lower than 10 to the power of minus 307) that for most application cases, we could safely assume that a random float will never be zero (if the manual would be refined accordingly).
Or am I overlooking some detail here, or is there a good reason why these corner cases should not be specified, e.g. to allow future changes of the implementation? In the latter case, maybe I should implement my own version of Random.float?