The mysterious pointer in the runtime closure representation

The runtime representation of closures/functions described in the book Developing Applications with Objective Caml seems to be outdated. In particular, the block sometimes contains a mysterious pointer that is neither the main code pointer nor part of the environment. For example, inspecting the runtime representation of f defined by

let f x1 x2 x3 x4 x5 x6 x7 = x1 + x2 + x3 + x4 + x5 + x6 + x7

shows that it is a block

OCaml object: 0X000055F084946FD0
    is a block with header: 0X0000000000000FF7
    with number of fields : 3
                    color : 3
                      tag : 247
    (field 0) code pointer: 0X000055F084908200
      (field 1) code info : 0X0700000000000007
                    arity : 7
               env offset : 3
        (field 2) raw hex : 0X000055F084908E90

The env offset of 3 means that the environment starts at field index 3, but the block size of 3 (not counting the header) means that the maximum field index is 2. Therefore this closure has no environment. Field 0 is the code pointer and field 1 holds the closure info (arity and env offset), but what does field 2 refer to?
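
For readers who want to check the numbers in the dump by hand, here is a minimal sketch of the decoding. It assumes a 64-bit runtime and the OCaml 4.12+ layout (size in the upper bits of the header, color in bits 8-9, tag in the low byte; arity in the top byte of the info word, env offset in the bits below it, low bit set). The names decode_header and decode_closinfo are made up for this illustration and are not part of any library.

```ocaml
(* Sketch only: assumes a 64-bit runtime and the OCaml 4.12+ closure layout.
   decode_header and decode_closinfo are hypothetical helper names. *)

(* Header word: size in words above bit 10, color in bits 8-9, tag in bits 0-7. *)
let decode_header (h : int64) : int * int * int =
  let wosize = Int64.to_int (Int64.shift_right_logical h 10) in
  let color = Int64.to_int (Int64.logand (Int64.shift_right_logical h 8) 3L) in
  let tag = Int64.to_int (Int64.logand h 0xFFL) in
  (wosize, color, tag)

(* Info word: signed arity in the top 8 bits (hence the arithmetic shift),
   start-of-environment offset in the bits below, lowest bit always 1. *)
let decode_closinfo (info : int64) : int * int =
  let arity = Int64.to_int (Int64.shift_right info 56) in
  let env_offset =
    Int64.to_int (Int64.shift_right_logical (Int64.shift_left info 8) 9)
  in
  (arity, env_offset)

let () =
  (* The two words taken from the dump above. *)
  let wosize, color, tag = decode_header 0x0FF7L in
  let arity, env_offset = decode_closinfo 0x0700000000000007L in
  Printf.printf "fields=%d color=%d tag=%d arity=%d env_offset=%d\n"
    wosize color tag arity env_offset;
  (* The environment is empty when its offset is not below the block size. *)
  Printf.printf "has environment: %b\n" (env_offset < wosize)
```

With the values from the dump, the env offset equals the number of fields, which is the "no environment" case described above.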

If I remember correctly, one of the two pointers (presumably the second one) is used whenever the closure is fully applied, that is, the number of arguments is equal to the arity field. Using the first one would work too, but it would cause numerous pointless allocations to store all the partial applications.

As @silene said, the first pointer is the code pointer to use for a generic application to 1 argument (which is necessary to build partial applications), and the second pointer is the code pointer to use for full applications. This second pointer is not strictly necessary to execute code, but it is crucial for efficient evaluation: it avoids creating a lot of intermediate partial applications when a function is fully applied.

Also, as an optimisation, for functions of arity 1, the two code pointers are the same, so the second one is discarded, and only 2 fields are used for the function.
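
If you want to observe this yourself, a rough sketch using the standard Obj module is shown below. The functions f1 and f7 are made-up examples. The field counts it prints depend on the compiler version and on whether the program is compiled to native code or bytecode, so no particular output is claimed here; with a 4.12+ native compiler one would expect the arity-1 closure to be one field smaller than the arity-7 one, as described above.

```ocaml
(* Sketch only: f1 and f7 are hypothetical example functions.
   Sys.opaque_identity keeps the optimiser from seeing through the values. *)
let f1 x = x + 1
let f7 x1 x2 x3 x4 x5 x6 x7 = x1 + x2 + x3 + x4 + x5 + x6 + x7

(* Print whether a value is a closure block and how many fields it has. *)
let describe name (v : Obj.t) =
  Printf.printf "%s: closure=%b fields=%d\n"
    name (Obj.tag v = Obj.closure_tag) (Obj.size v)

let () =
  describe "f1" (Obj.repr (Sys.opaque_identity f1));
  describe "f7" (Obj.repr (Sys.opaque_identity f7))
```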

You can see some other examples of closure representation in this graph: https://raw.githubusercontent.com/Gbury/ocaml-memgraph/master/examples/closure/example.svg

I knew I had something written somewhere, and it took me a while to find it, but you can take a look at this document: closures.md in the Gbury/ocaml-memgraph repository on GitHub (master branch).

I wrote this doc very late one night so it might contain some small errors, but it should overall be correct, or at least a good introduction for how the representation of sets of closures works.

In OCaml 4.12+, the macro for accessing the arity from the info field of a closure casts the field to intnat, which is a signed integer. Does this imply that there can be some sort of negative arity, or is it just an arbitrary and unimportant design decision?

Negative arities are indeed sometimes used, as far as I know only in some cases for functions that take a tuple as their only argument (e.g. let f (x, y) = ...), and this has been the case since way before 4.12. That being said:

  • the only use of the arity field is during an application, to decide whether the application is total. Since at runtime you can only pass a positive number of arguments, a negative arity simply means that the application is never seen as total (the negative arity can never be equal to the number of arguments given to the application).
  • in some cases, even tupled functions may not have negative arities; for instance, afaik flambda never generates functions with a negative arity.
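
To make the first point concrete, here is a simplified model of that decision. It is not the runtime's actual application code; the type call and the function classify are invented for this illustration, and the only thing modelled is the comparison between the arity field and the number of arguments supplied.

```ocaml
(* Simplified model, not the real runtime code: "call" and "classify"
   are hypothetical names used only for this illustration. *)
type call =
  | Full_application   (* jump to the second code pointer with all arguments *)
  | Generic_application (* go through the first code pointer, one argument at a time *)

(* nargs is the number of arguments at the call site, always >= 1. *)
let classify ~arity ~nargs =
  if arity = nargs then Full_application else Generic_application

let show = function
  | Full_application -> "full"
  | Generic_application -> "generic"

let () =
  (* An arity-7 closure applied to 7 arguments, then to 3 arguments. *)
  print_endline (show (classify ~arity:7 ~nargs:7));
  print_endline (show (classify ~arity:7 ~nargs:3));
  (* A negative arity can never equal a positive argument count. *)
  print_endline (show (classify ~arity:(-2) ~nargs:1))
```

In this model a closure whose arity field is negative always takes the generic path, whatever the number of arguments.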

Thanks for your replies @silene @zozozo! I need to understand the runtime value representation for a project, and I have now released a small library for displaying runtime values in textual form: OInspect.