The following is heavily caveated: test, test and test again, preferably using something like core_bench.
While trying to extract as much performance as possible from some code under ocaml-4.0.5, I made the following discoveries:
- Complex functions defined within function scope are less optimised than those defined outside
- Within function scope, small functions that simply call another function using variables from the outer scope have no impact on optimisation. So,
let foo x = let bar y = baz x y in ... is fine
- Partial function application isn’t well optimised, so
let f x y = ... in let g = f 1 in g 2 is slower than
f 1 2
- Cross-file optimisation is performed, especially inlining. However, here be dragons! If the external module is overly complex, the optimiser can give up even if the functions themselves are simple
- Monads combined with functors (a common idiom) cost around 15% even for the identity monad (see later).
- Closures are optimised out where possible; unfortunately this isn’t always the case, and monads seem to fall into this category
- Imperative code, using while and an FSM for example, is slower than the functional version (no real surprise here)
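To check the partial-application point on your own compiler, a minimal sketch along the following lines may help. The names full, partial and time are hypothetical, and Sys.time is only a crude CPU timer, not a substitute for core_bench. Whether any difference shows up at all will depend on the compiler version and flags.

```ocaml
(* Illustrative micro-benchmark; all names here are made up for the sketch. *)
let f x y = x + y

(* Full application: every call passes both arguments to [f] directly. *)
let full n =
  let acc = ref 0 in
  for i = 1 to n do
    acc := !acc + f 1 i
  done;
  !acc

(* Partial application: [g] is the closure produced by [f 1]. *)
let partial n =
  let g = f 1 in
  let acc = ref 0 in
  for i = 1 to n do
    acc := !acc + g i
  done;
  !acc

(* Crude timer: prints CPU seconds spent in [fn n] and returns its result. *)
let time label fn n =
  let t0 = Sys.time () in
  let result = fn n in
  let t1 = Sys.time () in
  Printf.printf "%s: %.3fs (result %d)\n" label (t1 -. t0) result;
  result

let () =
  let n = 10_000_000 in
  let a = time "full" full n in
  let b = time "partial" partial n in
  (* Both variants must compute the same sum. *)
  assert (a = b)
```

Run it several times with ocamlopt and compare the two timings. A single run of a loop this small is noisy, which is why a proper harness is preferable.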
Monad / functor example. Given,
module type IO = sig
type 'a t
val return : 'a -> 'a t
val (>>=) : 'a t -> ('a -> 'b t) -> 'b t
end
module Foo (IO : IO) = struct
(* bring return and (>>=) into scope *)
open IO
let add1 x = x >>= fun y -> return (y + 1)
end
(* the identity monad *)
module IO = struct
type 'a t = 'a
let return x = x
let (>>=) x f = f x
end
module Bar = Foo(IO);;
The bind (>>=) adds about 15% overhead, which stubbornly refused to be optimised away no matter how I cut the code.
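One way to measure this yourself is to compare the functor-built add1 against a hand-written equivalent. The sketch below is self-contained and restates the modules from the example. The identity monad is named Id here to keep it distinct from the signature, and add1_direct, loop_monadic and loop_direct are hypothetical names. It makes no claim about the size of the gap, which will depend on the compiler and its flags.

```ocaml
module type IO = sig
  type 'a t
  val return : 'a -> 'a t
  val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t
end

module Foo (IO : IO) = struct
  open IO
  let add1 x = x >>= fun y -> return (y + 1)
end

(* Identity monad: values are their own computations. *)
module Id = struct
  type 'a t = 'a
  let return x = x
  let ( >>= ) x f = f x
end

module Bar = Foo (Id)

(* Hand-written equivalent with no bind and no closure. *)
let add1_direct x = x + 1

(* Apply the monadic add1 n times, starting from 0. *)
let loop_monadic n =
  let acc = ref 0 in
  for _i = 1 to n do
    acc := Bar.add1 !acc
  done;
  !acc

(* Same loop using the direct version. *)
let loop_direct n =
  let acc = ref 0 in
  for _i = 1 to n do
    acc := add1_direct !acc
  done;
  !acc

let () =
  let n = 10_000_000 in
  let t0 = Sys.time () in
  let a = loop_monadic n in
  let t1 = Sys.time () in
  let b = loop_direct n in
  let t2 = Sys.time () in
  assert (a = b);
  Printf.printf "monadic: %.3fs  direct: %.3fs\n" (t1 -. t0) (t2 -. t1)
```

Because Id is not sealed with the IO signature, the type equality 'a Id.t = 'a stays visible, so Bar.add1 can be applied to a plain int.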
This is my experience with one library so reader beware! Also, if you are really looking for smoking performance head over to flambda.