Add support for stack allocation

olleharstedt · January 4, 2021, 8:49pm

Great talk, but funny enough not a word about allocation strategy.

timjs · January 10, 2021, 4:15pm

I really like the idea of the unboxed types presented and enjoyed watching Stephen’s talk. One thing that puzzles me is how the GC distinguishes between something of kind value and something of kind int64.

My understanding is that the GC traverses the whole stack and distinguishes pointers from immediates by their last bit. But how does the GC know something of kind int64 is not a GC pointer?

Is the idea to add bit masks to stack frames? (I think GHC does this if I’m not mistaken.) Does OCaml use stack frames at all?

silene · January 10, 2021, 4:31pm

If the value is boxed on the heap, then the box itself tells the GC not to scan its content.
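One way to observe this from OCaml itself, as a small sketch: the standard library's unsafe `Obj` module (used here for inspection only) exposes a block's tag, and a boxed `int64` carries a tag at or above `Obj.no_scan_tag`. The names `boxed`, `is_block` and `not_scanned` are only illustrative.

```ocaml
(* Inspect the runtime representation of a boxed int64.
   Obj is unsafe; it is used here purely for illustration. *)
let boxed : int64 = Int64.add 40L 2L

let r = Obj.repr boxed

(* A boxed int64 is a block, not an immediate. *)
let is_block = Obj.is_block r

(* Its tag is at or above No_scan_tag, which tells the GC
   not to scan the block's contents. *)
let not_scanned = Obj.tag r >= Obj.no_scan_tag

let () = Printf.printf "block: %b, no-scan: %b\n" is_block not_scanned
```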

If the value is unboxed on the stack, then there is a large table that, for every return address of the program (i.e., every function call), indicates how to interpret every word of the stack. So, when the GC runs, it visits the whole program stack, jumping from one return address to the other, marking the stack values as GC roots (or not), depending on the content of this table.
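The stack walk described above can be sketched with a toy model. This is not the runtime's actual data structure: the record `frame_descr`, the table `frame_table`, the addresses `0x1000` and `0x2000`, and the flat-array stack layout are all assumptions made up for illustration.

```ocaml
(* Toy model of a frame table, NOT the real OCaml runtime structures.
   Each return address maps to the size of its frame and the slots
   (offsets within the frame) that hold GC roots. *)
type frame_descr = { frame_size : int; live_slots : int list }

(* Hypothetical table keyed by return address. *)
let frame_table : (int, frame_descr) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.replace frame_table 0x1000 { frame_size = 3; live_slots = [0; 2] };
  Hashtbl.replace frame_table 0x2000 { frame_size = 2; live_slots = [1] }

(* The stack is modelled as a flat array: a return address followed by
   the words of its frame, then the next return address, and so on. *)
let scan_stack (stack : int array) : int list =
  let rec walk pos acc =
    if pos >= Array.length stack then List.rev acc
    else
      let d = Hashtbl.find frame_table stack.(pos) in
      (* The slots of this frame start right after the return address. *)
      let acc =
        List.fold_left
          (fun acc slot -> stack.(pos + 1 + slot) :: acc)
          acc d.live_slots
      in
      (* Jump to the next frame's return address. *)
      walk (pos + 1 + d.frame_size) acc
  in
  walk 0 []

(* Only the slots listed in the table are reported as roots. *)
let roots = scan_stack [| 0x1000; 10; 11; 12; 0x2000; 20; 21 |]
(* roots = [10; 12; 21] *)
```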

timjs · January 10, 2021, 5:30pm

Great! Thanks @silene, this helps my understanding and triggers another question.

Do I understand correctly that the table you’re referring to is what this proposal adds to keep track of unboxed types, and that current OCaml only uses pointer tagging? If not: why bother with both mechanisms?

silene · January 10, 2021, 5:49pm

I do not understand your question. The table I am describing has existed forever (or at least for 25 years).

timjs · January 10, 2021, 5:57pm

Ah, sorry that I’m unclear. Let me formulate it differently. We have a stack and we’d like to know which words are pointers and which are integers. If there is a table describing which words on the stack are pointers, and which are not, why does OCaml also tag integers to distinguish them? It seems to me that both approaches have the same goal.

silene · January 10, 2021, 6:09pm

All the values in memory form a directed graph. The GC starts from some roots and follows all the edges of the graph until it has accounted for all the nodes. Pointers on the stack are roots (if the table says so) and point to values on the heap. Tagging integers is for the sake of heap values; it tells the GC whether a word found in a heap block is a pointer or not.
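The low-bit test on heap words can be sketched as follows. This is a toy model over plain integers, not the runtime's C code: `is_immediate`, `encode_int` and `pointers_in_block` are hypothetical helpers, and the "addresses" `0x7f00` and `0x7f08` are made-up word-aligned values.

```ocaml
(* Toy model of the low-bit test on a machine word; the real check
   lives in the C runtime. *)
let is_immediate (word : int) : bool = word land 1 = 1

(* An OCaml integer n is stored as the machine word 2n + 1. *)
let encode_int (n : int) : int = (n lsl 1) lor 1

(* Scanning a heap block: keep only the words that look like pointers,
   i.e. those whose low bit is 0. *)
let pointers_in_block (block : int array) : int list =
  List.filter (fun w -> not (is_immediate w)) (Array.to_list block)

(* A block mixing two tagged integers and two aligned addresses. *)
let block = [| encode_int 7; 0x7f00; encode_int (-1); 0x7f08 |]

let found = pointers_in_block block
(* found = [0x7f00; 0x7f08] *)
```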

Note that no word (be it on the stack or on the heap) ever points to a block on the stack, hence this whole discussion: What would be needed to be able to allocate some blocks on the stack?

EduardoRFS · January 10, 2021, 6:10pm

Because you don’t want the GC to keep doing lookups in the table. Checking a single bit is fast; doing a lookup is not nearly as fast.

gadmm · January 11, 2021, 1:07pm

The deep reason for a uniform representation is that it allows expressive parametric polymorphism with a simple implementation. The Achilles’ heel of a kind-based memory representation is lack of kind polymorphism, unless you go into compile-time monomorphisation (templates, with their own set of drawbacks) or just-in-time compilation. A fortiori, this allows you to have types inhabited by a mix of pointers and immediates (e.g. the current representation of the option type), though this is not the only way to achieve this.
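The option example can be checked with a short sketch, again using the unsafe `Obj` module for inspection only. The names `none_is_immediate`, `some_is_block` and `is_boxed` are illustrative.

```ocaml
(* None is represented as an immediate, Some x as a pointer to a block:
   one type inhabited by both kinds of word. *)
let none_is_immediate = Obj.is_int (Obj.repr (None : int option))
let some_is_block = Obj.is_block (Obj.repr (Some 3))

(* With a uniform representation, one polymorphic function can handle
   any value without knowing its type. *)
let is_boxed (x : 'a) : bool = Obj.is_block (Obj.repr x)

let () =
  Printf.printf "None immediate: %b, Some block: %b\n"
    none_is_immediate some_is_block
```

Here `is_boxed 42` is `false` and `is_boxed "hello"` is `true`, with a single compiled copy of the function.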
