[Multicore] [Blog post] A deep dive into Multicore OCaml garbage collector

kayceesrk · July 16, 2017, 10:27pm

Here is the annotated slidedeck from my recent talk on the internals of Multicore OCaml GC. The aim is to provide an overview of the Multicore OCaml GC design with a few deep dives into some of the interesting parts of the GC design. The post has also been discussed elsewhere at Hacker News and reddit (r/programming, r/ocaml). Happy to answer questions!

gasche · July 16, 2017, 11:44pm

Thanks! I read the document earlier in the day and I think it is a truly excellent document – it ramps up slowly enough in term of sophistication, so that people that already know a bit about GC stuff but don’t routinely work with them can refresh their memories on the fly, and it goes into far more details than what I have previously seen (except an in-person talk by Stephen a couple years ago, but many things seem to have changed since then.).

One thing that I did not realize about your design before is the very subtle interaction between the fiber switching machinery and the GC. This aspect is fairly complex and easy to not think of when reading a description of the GC in “normal” functioning (without fiber switches), or when reading descriptions of the user-side of fiber scheduling (ML programming with effects, where it is natural to forget about GC interaction).

kayceesrk · July 17, 2017, 9:31am

Thanks @gasche! My intention was indeed to make the multicore GC design more approachable to non-experts so that we may have informed conversation around the design. Happy that you found the fiber-GC interactions interesting. This is something that is not widely known (GC books don’t talk about concurrency) since only a few languages have them. I’ve seen interesting discussions in golang around GC and goroutines. For example, [0,1]. Some of these are useful choices to consider for multicore OCaml.

[0] https://github.com/golang/proposal/blob/master/design/17503-eliminate-rescan.md
[1] https://github.com/golang/go/issues/19702

bluddy · July 17, 2017, 10:46pm

How does fiber switching and its complexity compare to the costs incurred when doing concurrency the monadic way? Generally speaking, it seems like Lwt/Async are building up large data structures analogous to these stacks, and that they use the default GC scheme, whereas by embedding and specializing the handling of these data structures in the runtime, we should get better performance. Does that sound right?

kayceesrk · July 18, 2017, 4:22pm

Lwt/Async unroll the execution stack into heap data structures, not to dissimilar to CPS translating the code. In particular, what are essentially stack frames in fibers are allocated as intermediate closures in monadic libraries. We do see better performance in this case. See performance results (a bit dated) in http://kcsrk.info/slides/multicore_fb16.pdf.

Topic		Replies	Views
Multicore Update: April 2020, with a preprint paper Community multicore , compiler , multicore-monthly	27	9330	June 5, 2020
About Multicore Ecosystem multicore , compiler , faq	59	13353	April 10, 2018
Multicore OCaml: May 2021 Community multicore , multicore-monthly	3	11323	July 12, 2021
Multicore, Async, and Lwt Ecosystem multicore , lwt , async	17	6663	September 15, 2023
Language abstractions and scheduling techniques for efficient execution of parallel algorithms on multicore hardware Community multicore	19	3693	October 15, 2021

[Multicore] [Blog post] A deep dive into Multicore OCaml garbage collector

Related topics