In my Railroad Tycoon implementation, I have a function call tree that’s quite heavy: I loop over every train, updating each one (position is updated mutably for performance). Some trains will be entering important decision points or stations, so if I want to keep things functional (and I do), I need to return a whole bunch of data structures, including stations, trains, player_info, etc. This requires returning large tuples of data structures from many execution paths, which gets somewhat icky and does not scale well. From the compiler’s perspective, is there an advantage to returning these tuples vs. returning one record that contains these items, i.e. can OCaml optimize this in any way? Since this loop is called very often, I would prefer it to be performant. Also, how does Flambda treat this matter?
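As far as I understand, at runtime you cannot tell tuples and records apart. Both are a contiguous block of memory whose header carries a block tag (the exception being records whose fields are all floats, which are stored as flat float arrays).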
That’s what I gathered from playing with C FFI bindings some time ago. This is a good read on the topic: Representation of OCaml data types
You can try this to convince yourself:
type a = int * int;;
type b = { x: int; y : int };;
external conv : a -> b = "%identity";;
let p = conv (2, 2);;
let _ = Printf.printf "{ x = %d; y = %d }\n" p.x p.y;;
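Another way to look at the same thing, without casting one type to the other, is to inspect the block headers through the `Obj` module. This is only a sketch for poking at the runtime representation: `pair_record` and `describe` are names made up for the example, and `Obj` is not something to rely on in real code.

```ocaml
(* Hypothetical record type, mirroring the two-int record used above. *)
type pair_record = { x : int; y : int }

(* Return (block tag, size in words) of a heap block.
   Only meaningful for boxed values such as tuples and records,
   not for immediates like ints. *)
let describe (v : 'a) : int * int =
  let r = Obj.repr v in
  (Obj.tag r, Obj.size r)

let tuple_shape = describe (2, 2)
let record_shape = describe { x = 2; y = 2 }

let () =
  let print name (tag, size) =
    Printf.printf "%s: tag = %d, size = %d words\n" name tag size
  in
  print "tuple" tuple_shape;
  print "record" record_shape
```

Both lines should report tag 0 and a size of 2 words, which is why the `%identity` cast above goes unnoticed by the runtime.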
Flambda can remove local tuple or record allocations, but is stuck at function boundaries. You can use inlining to remove a few such boundaries, but you will probably not manage to remove all allocations with that.
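To make the function-boundary point concrete, here is a small sketch with made-up names (`step`, `step_opaque`, `advance`, and so on are not from the original code). One helper is marked `[@inline]`, so the tuple it returns can become local to the caller; the other is marked `[@inline never]`, so the tuple has to cross a real call. The allocation numbers are measured with `Gc.minor_words` and depend entirely on the compiler (Flambda or not) and on the optimisation flags, so no particular output is promised.

```ocaml
(* Hypothetical helper returning two results as a tuple.
   With [@inline], the tuple can become a local allocation in the caller. *)
let[@inline] step (pos : int) (speed : int) : int * int =
  (pos + speed, speed)

(* Same body, but inlining is forbidden: the tuple crosses a function
   boundary and has to be allocated. *)
let[@inline never] step_opaque (pos : int) (speed : int) : int * int =
  (pos + speed, speed)

let advance pos speed =
  let new_pos, new_speed = step pos speed in
  new_pos + new_speed

let advance_opaque pos speed =
  let new_pos, new_speed = step_opaque pos speed in
  new_pos + new_speed

let loop_inlined n =
  let total = ref 0 in
  for i = 1 to n do total := !total + advance i 1 done;
  !total

let loop_opaque n =
  let total = ref 0 in
  for i = 1 to n do total := !total + advance_opaque i 1 done;
  !total

(* Run [f] and report how many minor-heap words were allocated meanwhile.
   The measurement itself allocates a little, so treat it as approximate. *)
let words_allocated (f : unit -> 'a) : 'a * float =
  let before = Gc.minor_words () in
  let result = f () in
  let after = Gc.minor_words () in
  (result, after -. before)

let () =
  let n = 1_000_000 in
  let _, inlined = words_allocated (fun () -> loop_inlined n) in
  let _, opaque = words_allocated (fun () -> loop_opaque n) in
  Printf.printf "inlined: %.0f words, opaque: %.0f words\n" inlined opaque
```

Comparing the two figures on your own compiler, for instance with and without `-O3` on a Flambda switch, gives a rough idea of how many of these allocations are actually being removed.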
On the other hand, there are a few promising things on the Jane Street fork:
- An [@unboxable] attribute on functions allows Flambda2 to automatically split your functions into a wrapper that takes or returns allocated values, and the main body taking individual parameters and returning multiple results. The wrapper is then inlined whenever possible, removing most of the allocations.
- Unboxed tuple and record types are coming (maybe they’re already there actually, I’m not sure). This allows you to return multiple values without allocating memory, without relying on optimisations.

This would make your code incompatible with upstream OCaml though.
Also, labeled tuples are on their way: First alpha release of OCaml 5.4.0!
Orthogonal to your questions, but have you actually benchmarked your code to gauge whether it needs to be more “performant”?
Cheers,
Nicolas