Beginner's question about optimizing a float -> float function with Owl

jbe · January 13, 2026, 11:47pm

Hi all, I am still a beginner with OCaml (I have looked into it some time ago but then not revisited it for a long time). I got curious about Owl for mathematical calculations.

I find it hard to get along with all the different types, but I’m doing my best.

As an exercise, I created the following short program, which tries to find the minimum of a function of type float -> float, here x^x:

#require "owl";;

let my_func x = x ** x;;

let pack_flt = Owl_algodiff.D.pack_flt;;
let unpack_flt = Owl_algodiff.D.unpack_flt;; 

let my_func_wrapped x = x |> unpack_flt |> my_func |> pack_flt;;  

let opt_params = Owl_optimise.D.Params.config 100.;;

let result = Owl_optimise.D.minimise_fun opt_params my_func_wrapped (pack_flt 0.567) |> snd;;

But I don’t seem to get the right result (which would be 1/e = 0.36787944117144):

$ utop optim.ml
--- Training config
    epochs         : 10
    batch          : sample of 100
    method         : gradient descent
    loss           : cross_entropy
    learning rate  : constant 0.001
    regularisation : none
    momentum       : none
    clipping       : none
    stopping       : none
    checkpoint     : none
    verbosity      : true
---
2026-01-14 02:53:10.482 INFO : T: 00s | E: 1.0/10 | B: 1/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 2.0/10 | B: 2/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 3.0/10 | B: 3/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 4.0/10 | B: 4/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 5.0/10 | B: 5/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 6.0/10 | B: 6/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 7.0/10 | B: 7/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 8.0/10 | B: 8/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 9.0/10 | B: 9/10 | L: 0.724906[-]
2026-01-14 02:53:10.482 INFO : T: 00s | E: 10.0/10 | B: 10/10 | L: 0.724906[-]
--- Training summary
    Duration: 00s
0.567

What am I doing wrong?

jbe · January 14, 2026, 11:26am

I think I found the problem:

Using Owl_algodiff.D.pack_flt takes a float and converts it into a Owl_algodiff.D.t that resembles a constant. Owl_optimise.D.minimise_fun does not calculate derivatives on its own.

So with the following workaround (still relying on a float -> float function as input), I get the right result:

 #require "owl";;
 
 let my_func x = x ** x;;
 
+let delta = 1e-6;;
+let my_func_diff x =
+  ( my_func (x +. delta) -. my_func (x -. delta) ) /.
+  (delta +. delta);;
+
 let pack_flt = Owl_algodiff.D.pack_flt;;
 let unpack_flt = Owl_algodiff.D.unpack_flt;;
 
-let my_func_wrapped x = x |> unpack_flt |> my_func |> pack_flt;;
+let my_func_wrapped x =
+  let open Owl_algodiff.D.Maths in
+  pack_flt (my_func (unpack_flt x)) +
+  pack_flt (my_func_diff (unpack_flt x)) * (x - (pack_flt (unpack_flt x)));;
 
-let opt_params = Owl_optimise.D.Params.config 10.;;
+let opt_params = Owl_optimise.D.Params.config 10000.;;
 
 let result = Owl_optimise.D.minimise_fun opt_params my_func_wrapped (pack_flt 0.567) |> snd;;
 
 let _ = result |> unpack_flt |> Float.to_string |> print_endline;;

Note that I (ab)use pack_flt (unpack_flt x) to convert x to a constant here.

I also see that Owl provides numerical differentiation. I tried to use that instead of manually specifying a delta, but the code got somewhat messy (though it does work):

#require "owl";;

let my_func x = x ** x;;

let pack_flt = Owl_algodiff.D.pack_flt;;
let unpack_flt = Owl_algodiff.D.unpack_flt;;

let my_func_wrapped x =
  let open Owl_algodiff.D.Maths in
  let module D = Owl_numdiff_generic.Make (Owl.Arr) in
  let wrapped_func x = my_func (Owl.Arr.get x [|0|]) in
  pack_flt (my_func (unpack_flt x)) +
  pack_flt (
    Owl.Arr.get
    (D.grad wrapped_func (Owl.Arr.create [|1|] (unpack_flt x)))
    [|0|]
  ) * (x - (pack_flt (unpack_flt x)));;

let opt_params = Owl_optimise.D.Params.config 10000.;;

let result = Owl_optimise.D.minimise_fun opt_params my_func_wrapped (pack_flt 0.567) |> snd;;

let _ = result |> unpack_flt |> Float.to_string |> print_endline;;

Lots and lots of packing and wrapping and unpacking.

My questions:

What would be the idiomatic way to solve this problem (“I have a float -> float function and want to find the argument where the result is minimized”) using Owl?
I got a bit confused by Owl_algodiff vs Owl.Algodiff. What is the difference? I suspect Owl_algodiff is more the internal module and Owl.Algodiff is the official API? I noticed, for example, that we have Owl.Arr but no Owl_arr.
Am I supposed to use Owl_numdiff_generic in higher-level code at all? I see it belongs to owl-base rather than the owl library.

ghennequin · January 23, 2026, 6:10pm

You need to define your function using the maths operators provided by Algodiff.D.Maths. An idiomatic way of using owl for this problem would be

open Owl
module AD = Algodiff.D (* a common shorthand *)

(* AD.t -> AD.t *)
let my_func x = AD.Maths.(x ** x)

let _, x_opt (* result is AD.F 0.367305 on my machine *)
    let open Optimise.D in
    (* pick some optimiser -- Adam works well for most problems *)
    let learning_rate = Learning_Rate.Adam (0.01, 0.9, 0.99) in
    (* we'll run the optimiser for 100 iterations *)
    let config = Params.config ~learning_rate 100. in
    (* run the optimiser from initial condition 0.5 *)
    minimise_fun config my_func (AD.F 0.5)

The key thing to understand is that simply wrapping a function like so

let my_func_flt x = x **. x
let my_func x = AD.pack_flt (my_func_flt (AD.unpack_flt x))

just won’t work, because the machinery that allows owl to automatically compute gradients lies within the AD.t type and the AD.Maths.* operators. Simply using OCaml’s **. operator won’t allow owl to implement a reverse pass to propagate gradients.

Owl’s automatic differentiation module is brilliant and one of the most ergonomic I know. I would recommend spending some time studying the tutorial on algodiff. You might also be interested in raven which is still under development but promises to be an awesome take on scientific computing in OCaml.

ghennequin · January 23, 2026, 6:26pm

And just to elaborate a bit: there’s no way of automatically lifting a float → float function into a useful AD.t → AD.t function which you can then automatically differentiate through. However, it’s very easy to lower a AD.t → AD.t function into a float → float function; this would look like

let lower f x = AD.unpack_flt (f (AD.pack_flt x))

So I recommend embracing the AD.t type as your main numerical type, and only convert from/to float (or bigarrays in the form of Arr.arr via AD.unpack_arr) at the very periphery of your code (e.g. importing a dataset, or saving results to disk, etc). You might even find it useful to open AD.Maths.

To answer your other qs: Owl’s main user-facing modules are in “front-end” modules Owl.Foo (in your case, Owl.Optimise) even though, for some of them, they are just aliases for the corresponding backend module Owl_foo (e.g. Owl_optimise). The fact that frontend = backend for these modules is just an implementation choice the owl developers have made, but I suppose this could change any time (if the project was still actively developed ;D) so it’s cleaner to use the frontend Owl.***.

Oh, and you should hardly have any need for numdiff (I believe this is mostly used for testing algodiff). Numdiff is approximate (finite-difference) differentiation, whereas algodiff is exact (modulo machine precision).

jbe · January 23, 2026, 8:13pm

Thank you for your example. It looks quite clean and concise.

I was wondering if I have to create the function “symbolically”. I see how this can improve speed and precision, but I wanted to not have too deep ties between Owl and my own code, so I wondered if it can work entirely numerically. This is why I wrote:

I meant the literal OCaml type float -> float, not a function ℝ → ℝ.

But I take that the idiomatic way is to not provide a numerical function but a always symbolical one when working with Owl.

I guess if I use Owl, I’ll need a deeper integration then and do all my math on top of it.

Oh okay, so the Owl_[…] modules are somewhat “exposed backends”. In short: Use Owl.Foo instead. Not every library handles things this way though, I presume?

Yeah, well, I looked into its implementation. It is just a very simple algorithm that doesn’t scale well. It simply uses 0.00001 as an ε value (source). I thought there was some more “magic” to it.

I’ll definitely have a look at raven as well. Thanks!

Topic		Replies	Views
Owl project restructured Ecosystem announce	10	2058	July 21, 2024
[ANN] owl 0.5.0 released Community announce	5	1287	March 8, 2019
[ANN] owl 0.8.0 and 0.9.0 released Community announce	5	1737	July 15, 2021
Owl - Need help with basic usage Learning owl	4	2090	August 4, 2019
GPU accelerated numerical ops on OCaml arrays via pyml + jax? Learning	26	2298	May 25, 2023

Beginner's question about optimizing a float -> float function with Owl

Related topics