The "Arity Curtain"

I acknowledge that the notion of arity as the number recorded in a closure’s info field is unusual. Usually, the arity of a function means the number of parameters it can take, which is, implicitly, the maximum number of parameters. However, in the context of an OCaml closure’s info field, the term arity adopts a different meaning. This is best illustrated by the example below.

Let’s first define:

``````let sum_six = fun a b c d e f -> a + b + c + d + e + f
(* partial sums *)
let psum1 a         = sum_six a
let psum2 a b       = sum_six a b
let psum3 a b c     = sum_six a b c
let psum4 a b c d   = sum_six a b c d
let psum5 a b c d e = sum_six a b c d e
``````

Then we ask: what are the arities of `psum1`, …, `psum5`, respectively? Intuitively, we would say their arities are all the same: 6! However, when I use a runtime value explorer (I use OInspect) to peek into the blocks of these closures, I get arities 1, …, 5 for them respectively. For instance, `psum1` has runtime repr.

``````OCaml Value : 0X0000555CE6851C90
wo-size : 2
color : 3
closure tag : 247
Field 0 : 0X0000555CE681C170 ... code partial appl.
Field 1 : 0X0100000000000005 ... info
Arity : 1
Env.  : 2
... no env.
``````

`psum2` has runtime repr.

``````OCaml Value : 0X0000555CE6851CA8
wo-size : 3
color : 3
closure tag : 247
Field 0 : 0X0000555CE681BF80 ... code partial appl.
Field 1 : 0X0200000000000007 ... info
Arity : 2
Env.  : 3
Field 2 : 0X0000555CE681C1F0 ... code total appl.
... no env.
``````

and `psum3` has runtime repr.

``````OCaml Value : 0X0000555CE6851CC8
wo-size : 3
color : 3
closure tag : 247
Field 0 : 0X0000555CE681BE90 ... code partial appl.
Field 1 : 0X0300000000000007 ... info
Arity : 3
Env.  : 3
Field 2 : 0X0000555CE681C280 ... code total appl.
... no env.
``````
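The `Arity` and `Env.` lines in these dumps can be recomputed from the raw info word. Here is a minimal sketch, assuming the 64-bit layout that the dumps suggest: arity in the top 8 bits, an integer tag in the lowest bit, and the environment offset (in words) in between. The name `decode_info` is mine, not part of any library.

```ocaml
(* Decode a closure info word as printed in the dumps (64-bit layout assumed):
   the top 8 bits hold the arity, the lowest bit is the integer tag, and the
   bits in between hold the offset of the environment, in words. *)
let decode_info (w : int64) : int * int =
  let arity = Int64.to_int (Int64.shift_right w 56) in
  let start_env =
    (* Drop the 8 arity bits on the left, then the tag bit on the right. *)
    Int64.to_int (Int64.shift_right_logical (Int64.shift_left w 8) 9)
  in
  (arity, start_env)

let () =
  List.iter
    (fun (name, w) ->
      let arity, start_env = decode_info w in
      Printf.printf "%s: info = 0x%016LX, arity = %d, env offset = %d\n"
        name w arity start_env)
    [ ("psum1", 0x0100000000000005L);
      ("psum2", 0x0200000000000007L);
      ("psum3", 0x0300000000000007L);
      ("sum_six", 0x0600000000000007L) ]
```

For `psum1` this gives arity 1 and env offset 2, and for `sum_six` arity 6 and env offset 3, matching what the explorer prints.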

Using the manual’s terminology, it seems that for an identifier `<value-name>` (e.g., `psum1`) defined by

``````let <value-name> = fun <parameter>+ -> <expr>
``````

where `<expr>` does not have the form `fun <parameter>+ -> ...`, the arity (as in the closure’s info field) of `<value-name>` is the number of formal parameters `<parameter>+`; moreover, when `<expr>` itself is a partial application, the number of missing parameters does not count towards the arity of `<value-name>`.

An abstraction (e.g. `fun a -> sum_six a`) is like an “arity curtain” that prevents the compiler from seeing that actually `psum1` can take 6 arguments. I guess this may have something to do with the evaluation model of OCaml, where an abstraction body is not evaluated at definition time.

The runtime representation of `sum_six 1 2 3 4 5` is quite different from that of `psum1 1 2 3 4 5` as well. For `sum_six 1 2 3 4 5`, the block has quite a neat structure: header, code pointer, info, and env that consists of the original function `sum_six` and its five arguments, as follows:

``````OCaml Value : 0X00007F77B626EE78
wo-size : 8
color : 0
closure tag : 247
Field 0 : 0X0000561F3D79E240 ... code partial appl.
Field 1 : 0X0100000000000005 ... info
Arity : 1
Env.  : 2
env starts ...
Field 2 :
....OCaml Value : 0X0000000000000003
.... Is integer : 1 (in decimal)
Field 3 :
....OCaml Value : 0X0000000000000005
.... Is integer : 2 (in decimal)
Field 4 :
....OCaml Value : 0X0000000000000007
.... Is integer : 3 (in decimal)
Field 5 :
....OCaml Value : 0X0000000000000009
.... Is integer : 4 (in decimal)
Field 6 :
....OCaml Value : 0X000000000000000B
.... Is integer : 5 (in decimal)
Field 7 :
....OCaml Value : 0X0000561F3D7D3C68
....              wo-size : 3
....              color : 3
....              closure tag : 247
....    Field 0 : 0X0000561F3D79D800 ... code partial appl.
....    Field 1 : 0X0600000000000007 ... info
....              Arity : 6
....              Env.  : 3
....    Field 2 : 0X0000561F3D79E110 ... code total appl.
....    ... no env.

``````

However, the runtime repr. of `psum1 1 2 3 4 5` is much more involved; it is a cascade of partial applications, which basically says that `psum1 1 2 3 4 5` is a partial application to 5 of `psum1 1 2 3 4`, which is a partial application to 4 of `psum1 1 2 3`, and so on, down to `psum1 1`, which is a partial application to 1 of `sum_six`:

``````OCaml Value : 0X00007F77B626ED90
wo-size : 4
color : 0
closure tag : 247
Field 0 : 0X0000561F3D79DCD0 ... code partial appl.
Field 1 : 0X0100000000000005 ... info
Arity : 1
Env.  : 2
env starts ...
Field 2 :
....OCaml Value : 0X000000000000000B
.... Is integer : 5 (in decimal)
Field 3 :
....OCaml Value : 0X00007F77B626EDB8
....              wo-size : 5
....              color : 0
....              closure tag : 247
....    Field 0 : 0X0000561F3D79DC80 ... code partial appl.
....    Field 1 : 0X0200000000000007 ... info
....              Arity : 2
....              Env.  : 3
....    Field 2 : 0X0000561F3D79DC50 ... code total appl.
....    env starts ...
....    Field 3 :
........OCaml Value : 0X0000000000000009
........ Is integer : 4 (in decimal)
....    Field 4 :
........OCaml Value : 0X00007F77B626EDE8
........              wo-size : 5
........              color : 0
........              closure tag : 247
........    Field 0 : 0X0000561F3D79DBF0 ... code partial appl.
........    Field 1 : 0X0300000000000007 ... info
........              Arity : 3
........              Env.  : 3
........    Field 2 : 0X0000561F3D79DBC0 ... code total appl.
........    env starts ...
........    Field 3 :
............OCaml Value : 0X0000000000000007
............ Is integer : 3 (in decimal)
........    Field 4 :
............OCaml Value : 0X00007F77B626EE18
............              wo-size : 5
............              color : 0
............              closure tag : 247
............    Field 0 : 0X0000561F3D79DB60 ... code partial appl.
............    Field 1 : 0X0400000000000007 ... info
............              Arity : 4
............              Env.  : 3
............    Field 2 : 0X0000561F3D79DB30 ... code total appl.
............    env starts ...
............    Field 3 :
................OCaml Value : 0X0000000000000005
................ Is integer : 2 (in decimal)
............    Field 4 :
................OCaml Value : 0X00007F77B626EE48
................              wo-size : 5
................              color : 0
................              closure tag : 247
................    Field 0 : 0X0000561F3D79DAD0 ... code partial appl.
................    Field 1 : 0X0500000000000007 ... info
................              Arity : 5
................              Env.  : 3
................    Field 2 : 0X0000561F3D79E190 ... code total appl.
................    env starts ...
................    Field 3 :
....................OCaml Value : 0X0000000000000003
.................... Is integer : 1 (in decimal)
................    Field 4 :
....................OCaml Value : 0X0000561F3D7D3C68
....................              wo-size : 3
....................              color : 3
....................              closure tag : 247
....................    Field 0 : 0X0000561F3D79D800 ... code partial appl.
....................    Field 1 : 0X0600000000000007 ... info
....................              Arity : 6
....................              Env.  : 3
....................    Field 2 : 0X0000561F3D79E110 ... code total appl.
....................    ... no env.

``````

The arity of a closure is the exact number of arguments expected by the underlying code pointer (the second one if there are two of them). It is an implementation detail, and does not necessarily have any link to the source-level function, though obviously in many cases an OCaml function with `n` parameters will be compiled to a single closure of arity `n`.

Here are a few examples to illustrate:

``````(* Hack to get the arity on current versions of the compiler.
Only works in native mode. *)
let arity x = Obj.(obj (field (repr x) 1) asr (Sys.int_size - 8))

let f { contents = x } y = x + y (* Two parameters *)
let () = Printf.printf "Arity of f is %d%!\n" (arity f)
let id x = x
let () = Printf.printf "Arity of id is %d%!\n" (arity id)
let () = id id id id id () (* Applied to 5 arguments *)
let tupled (x, y) = x + y (* Single tupled argument *)
let () = Printf.printf "Arity of tupled is %d%!\n" (arity tupled)
``````
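The same hack can be pointed at the definitions from the question. Below is a small sketch with a guarded variant of `arity` (the names `arity_opt`, `show` and `psum1_eta` are made up for this example). It assumes native code and a compiler that stores the arity in the info field; in bytecode, or on older compilers, the printed numbers are not meaningful.

```ocaml
(* Definitions repeated from the question so the snippet is self-contained. *)
let sum_six = fun a b c d e f -> a + b + c + d + e + f
let psum1 a         = sum_six a
let psum3 a b c     = sum_six a b c
let psum5 a b c d e = sum_six a b c d e
(* Hypothetical eta-expanded variant: all six parameters are spelled out. *)
let psum1_eta a b c d e f = sum_six a b c d e f

(* Guarded variant of the arity hack: returns None unless the value is a
   closure block with at least two fields. The result is only meaningful in
   native code on compilers that store the arity in the info field. *)
let arity_opt (x : 'a) : int option =
  let r = Obj.repr x in
  if Obj.is_block r && Obj.tag r = Obj.closure_tag && Obj.size r >= 2
  then Some ((Obj.obj (Obj.field r 1) : int) asr (Sys.int_size - 8))
  else None

let show name f =
  match arity_opt f with
  | Some n -> Printf.printf "Arity of %s is %d\n" name n
  | None -> Printf.printf "Arity of %s is unavailable\n" name

let () =
  show "sum_six" sum_six;
  show "psum1" psum1;
  show "psum3" psum3;
  show "psum5" psum5;
  show "psum1_eta" psum1_eta
```

In native code I would expect this to report 6 for `sum_six`, and 1, 3 and 5 for `psum1`, `psum3` and `psum5`, in line with the dumps in the question. Since `psum1_eta` spells out all six parameters, it should be compiled to a closure of arity 6 as well.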

Regarding the representation of `sum_six 1 2 3 4 5` vs `psum1 1 2 3 4 5`, the first one is the result of an optimisation: `sum_six` is known to have an arity greater than 5, so we can build the partial application closure at compile time with all the parameters in a single closure.
However, `psum1` has arity 1 and returns a function of arity 5 (the partial application of `sum_six` to a single argument). The compiler therefore generates an exact application of `psum1` to the first argument `1`, but passes the remaining 4 arguments using a generic application of an unknown function. Since this unknown function ends up, at runtime, having an arity (5) different from the number of arguments (4), they are passed one by one using the partial application code pointers. This is what creates the nested closures.
Note that if you use Flambda, there is a good chance that the compiler will manage to generate the same result for `psum1 1 2 3 4 5` as for `sum_six 1 2 3 4 5`. You may have to increase the optimisation level a bit for the optimisation to trigger fully.
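
A rough way to check which representation you got, without a value explorer, is to compare the sizes of the outermost blocks. This is only a sketch: the helper `words` is a name invented here, and the sizes depend on the backend, the compiler version and the optimisation settings.

```ocaml
let sum_six = fun a b c d e f -> a + b + c + d + e + f
let psum1 a = sum_six a

(* Number of fields of the block behind a value, or 0 for an immediate. *)
let words (x : 'a) : int =
  let r = Obj.repr x in
  if Obj.is_block r then Obj.size r else 0

let () =
  let direct = sum_six 1 2 3 4 5 in
  let via_psum1 = psum1 1 2 3 4 5 in
  Printf.printf "sum_six 1 2 3 4 5: %d words\n" (words direct);
  Printf.printf "psum1 1 2 3 4 5:   %d words\n" (words via_psum1);
  (* Both closures still compute the same sum once the last argument arrives. *)
  Printf.printf "results: %d and %d\n" (direct 6) (via_psum1 6)
```

In the dumps from the question, the outer blocks have 8 and 4 words respectively. If both lines report the same size, the compiler has probably built the flat closure in both cases.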

What you should keep in mind when looking at arities is that all closures with non-1 arity are actually optimisations over the basic representation. In particular, the bytecode compiler doesn’t optimise the representation of closures, always generating closures of arity 1 (and storing 0 in the arity field, which is never read in bytecode).
