Maybe this is a question on compilers for functional languages in general, not specific to OCaml.
Consider a nested tuple of type, say: let abc: (A * (B * C)) = (a, (b, c)).
In the usual compilation, its memory representation will contain two levels of indirections: one for the first pair and another for the second pair.
However, if I wrote the code using type triple = type triple = {a: A; b: B; c: C} in the first place, I could have saved one level of indirection. (for simplicity, let us assume that the subterm (b, c) was used only as a subterm of abc, and not used elsewhere)
I wonder if an optimization that translates the former to the latter is discussed/implemented somewhere? (I have checked this, but I am not sure if they are sufficient to cover all cases - e.g., interprocedural cases)
And if there is such optimization, I wonder if they cover recursive types? For example, ideally, it could be possible to optimize a linked list in the usual functional programming into a vector-like representation (with some assumptions on its API/representation independence).
Sorry, but I don’t understand the question. The type triple is a sum type, I don’t see how you could rewrite code using the nested tuple type with it. Can you add some more detail?
There are at least two main difficulties. The first one is specific to OCaml, in that its runtime does not allow pointers to point inside data structures. So, a call like snd abc would create an invalid pointer which would break havoc in the garbage collector (among other things). Therefore, the compiler would have to perform a global pass to check that a function such as snd is never called on a value of that type.
And this brings us to the second major issue: polymorphism. If the language allows for polymorphism, then static typing is no longer sufficient to detect such improper uses of snd. Note also that there is a slight variation on the issue of polymorphism. Consider the type (A * B) * C instead. The snd function now needs to return the third field of the flattened value. But without type information at runtime, it would end up returning the second field, which is not the second component anymore. (A workaround would be to disallow runtime polymorphism and thus to monomorphize the whole program at compile time.)