Hi all, I am still a beginner with OCaml (I have looked into it some time ago but then not revisited it for a long time). I got curious about Owl for mathematical calculations.
I find it hard to get along with all the different types, but I’m doing my best.
As an exercise, I created the following short program, which tries to find the minimum of a function of type float -> float, here x^x:
#require "owl";;
let my_func x = x ** x;;
let pack_flt = Owl_algodiff.D.pack_flt;;
let unpack_flt = Owl_algodiff.D.unpack_flt;;
let my_func_wrapped x = x |> unpack_flt |> my_func |> pack_flt;;
let opt_params = Owl_optimise.D.Params.config 100.;;
let result = Owl_optimise.D.minimise_fun opt_params my_func_wrapped (pack_flt 0.567) |> snd;;
But I don’t seem to get the right result (which would be 1/e = 0.36787944117144):
Using Owl_algodiff.D.pack_flt takes a float and converts it into a Owl_algodiff.D.t that resembles a constant. Owl_optimise.D.minimise_fun does not calculate derivatives on its own.
So with the following workaround (still relying on a float -> float function as input), I get the right result:
#require "owl";;
let my_func x = x ** x;;
+let delta = 1e-6;;
+let my_func_diff x =
+ ( my_func (x +. delta) -. my_func (x -. delta) ) /.
+ (delta +. delta);;
+
let pack_flt = Owl_algodiff.D.pack_flt;;
let unpack_flt = Owl_algodiff.D.unpack_flt;;
-let my_func_wrapped x = x |> unpack_flt |> my_func |> pack_flt;;
+let my_func_wrapped x =
+ let open Owl_algodiff.D.Maths in
+ pack_flt (my_func (unpack_flt x)) +
+ pack_flt (my_func_diff (unpack_flt x)) * (x - (pack_flt (unpack_flt x)));;
-let opt_params = Owl_optimise.D.Params.config 10.;;
+let opt_params = Owl_optimise.D.Params.config 10000.;;
let result = Owl_optimise.D.minimise_fun opt_params my_func_wrapped (pack_flt 0.567) |> snd;;
let _ = result |> unpack_flt |> Float.to_string |> print_endline;;
Note that I (ab)use pack_flt (unpack_flt x) to convert x to a constant here.
I also see that Owl provides numerical differentiation. I tried to use that instead of manually specifying a delta, but the code got somewhat messy (though it does work):
#require "owl";;
let my_func x = x ** x;;
let pack_flt = Owl_algodiff.D.pack_flt;;
let unpack_flt = Owl_algodiff.D.unpack_flt;;
let my_func_wrapped x =
let open Owl_algodiff.D.Maths in
let module D = Owl_numdiff_generic.Make (Owl.Arr) in
let wrapped_func x = my_func (Owl.Arr.get x [|0|]) in
pack_flt (my_func (unpack_flt x)) +
pack_flt (
Owl.Arr.get
(D.grad wrapped_func (Owl.Arr.create [|1|] (unpack_flt x)))
[|0|]
) * (x - (pack_flt (unpack_flt x)));;
let opt_params = Owl_optimise.D.Params.config 10000.;;
let result = Owl_optimise.D.minimise_fun opt_params my_func_wrapped (pack_flt 0.567) |> snd;;
let _ = result |> unpack_flt |> Float.to_string |> print_endline;;
Lots and lots of packing and wrapping and unpacking.
My questions:
What would be the idiomatic way to solve this problem (“I have a float -> float function and want to find the argument where the result is minimized”) using Owl?
I got a bit confused by Owl_algodiff vs Owl.Algodiff. What is the difference? I suspect Owl_algodiff is more the internal module and Owl.Algodiff is the official API? I noticed, for example, that we have Owl.Arr but no Owl_arr.
Am I supposed to use Owl_numdiff_generic in higher-level code at all? I see it belongs to owl-base rather than the owl library.
You need to define your function using the maths operators provided by Algodiff.D.Maths. An idiomatic way of using owl for this problem would be
open Owl
module AD = Algodiff.D (* a common shorthand *)
(* AD.t -> AD.t *)
let my_func x = AD.Maths.(x ** x)
let _, x_opt (* result is AD.F 0.367305 on my machine *)
let open Optimise.D in
(* pick some optimiser -- Adam works well for most problems *)
let learning_rate = Learning_Rate.Adam (0.01, 0.9, 0.99) in
(* we'll run the optimiser for 100 iterations *)
let config = Params.config ~learning_rate 100. in
(* run the optimiser from initial condition 0.5 *)
minimise_fun config my_func (AD.F 0.5)
The key thing to understand is that simply wrapping a function like so
let my_func_flt x = x **. x
let my_func x = AD.pack_flt (my_func_flt (AD.unpack_flt x))
just won’t work, because the machinery that allows owl to automatically compute gradients lies within the AD.t type and the AD.Maths.* operators. Simply using OCaml’s **. operator won’t allow owl to implement a reverse pass to propagate gradients.
Owl’s automatic differentiation module is brilliant and one of the most ergonomic I know. I would recommend spending some time studying the tutorial on algodiff. You might also be interested in raven which is still under development but promises to be an awesome take on scientific computing in OCaml.
And just to elaborate a bit: there’s no way of automatically lifting a float → float function into a useful AD.t → AD.t function which you can then automatically differentiate through. However, it’s very easy to lower a AD.t → AD.t function into a float → float function; this would look like
let lower f x = AD.unpack_flt (f (AD.pack_flt x))
So I recommend embracing the AD.t type as your main numerical type, and only convert from/to float (or bigarrays in the form of Arr.arr via AD.unpack_arr) at the very periphery of your code (e.g. importing a dataset, or saving results to disk, etc). You might even find it useful to open AD.Maths.
To answer your other qs: Owl’s main user-facing modules are in “front-end” modules Owl.Foo (in your case, Owl.Optimise) even though, for some of them, they are just aliases for the corresponding backend module Owl_foo (e.g. Owl_optimise). The fact that frontend = backend for these modules is just an implementation choice the owl developers have made, but I suppose this could change any time (if the project was still actively developed ;D) so it’s cleaner to use the frontend Owl.***.
Oh, and you should hardly have any need for numdiff (I believe this is mostly used for testing algodiff). Numdiff is approximate (finite-difference) differentiation, whereas algodiff is exact (modulo machine precision).
Thank you for your example. It looks quite clean and concise.
I was wondering if I have to create the function “symbolically”. I see how this can improve speed and precision, but I wanted to not have too deep ties between Owl and my own code, so I wondered if it can work entirely numerically. This is why I wrote:
I meant the literal OCaml type float -> float, not a function ℝ → ℝ.
But I take that the idiomatic way is to not provide a numerical function but a always symbolical one when working with Owl.
I guess if I use Owl, I’ll need a deeper integration then and do all my math on top of it.
Oh okay, so the Owl_[…] modules are somewhat “exposed backends”. In short: Use Owl.Foo instead. Not every library handles things this way though, I presume?
Yeah, well, I looked into its implementation. It is just a very simple algorithm that doesn’t scale well. It simply uses 0.00001 as an ε value (source). I thought there was some more “magic” to it.
I’ll definitely have a look at raven as well. Thanks!