It might be worth stating for some readers that those 20% are not a pure waste of resources. Indeed, even without a GC, memory management is not free.
Modular implicits, multicore and algebraic effects, Flambda, possibly more core OCaml developers thanks to recent announcements, lots of other improvements (e.g. better error messages from the compiler), “Bob”, and then this very work… it really seems there is a great window of opportunity in the next few years to make OCaml an even greater language than it is today. I hope all this good work will find its way into OCaml in the near future. The last steps for integration into the compiler are, I guess, often the hardest ones; I hope the people concerned will not be discouraged by this endeavor…
By generative, I mean that a computation will not have any observable effects (besides possibly an allocation). I think it’s irrelevant, since the compiler will never see values: it operates on different abstractions. So there shouldn’t be any problems in the middle end. Even in the backend I wouldn’t expect any problems, besides maybe the ABI implementation.
What I’m thinking is that some primitives that rely on OCaml’s runtime value representation might become unimplementable in the new representation, e.g.,
equal, etc. Take structural equality for example: so far it uses the lowest bit to distinguish between blocks and immediates. If the bit is set to
1, it doesn’t follow the pointer. So it is not obvious to me how to implement polymorphic compare with the proposed representation (other than making such values non-comparable, like functions, for example).
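For readers unfamiliar with the current representation, a minimal sketch of that tag-bit test, using `Obj` purely for illustration (everything below is standard stdlib, nothing from the proposal):

```ocaml
(* OCaml's uniform representation: an immediate (int, argument-less
   constructor, etc.) has its lowest bit set to 1; a pointer to a
   heap block has it set to 0.  Obj.is_int exposes that test. *)
let () =
  assert (Obj.is_int (Obj.repr 42));          (* immediate: tagged int *)
  assert (not (Obj.is_int (Obj.repr [ 1 ]))); (* heap block: pointer *)
  (* Polymorphic compare branches on this bit: immediates are compared
     directly, blocks are followed and compared field by field. *)
  assert (compare 1 2 < 0);
  assert (compare [ 1; 2 ] [ 1; 3 ] < 0)
```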
The proposal considers adding an allocation method based on RAII in addition to the GC, similar for this purpose to Fluet, Morrisett and Ahmed’s linear regions (2006) (not to be confused with Tofte–Talpin regions). It has pros and cons. It is mostly suitable for long-lived memory allocations, for large or mutation-intensive values (because it can be used to bypass the write barrier). Bringing actual figures into the discussion is very helpful. Do you have a more precise picture of how and to what extent an allocation method alternative to the GC would have helped?
Don’t hold your breath! Everything remains to be done. There are still a few things to figure out before one can even get started with a prototype. I would prefer that this proposal be seen as a survey of the literature and techniques, plus some connecting-the-dots, meant to show the feasibility and usefulness of such a model and to motivate the real hard work.
Thanks for the details.
I have thought about the generic functions, though you are right that this is not mentioned in the proposal. It makes sense to me that linear values should be compared physically: if two resources have different addresses, then they are different. This is similar to how objects are treated. The current behaviour of hash, compare, and equal, which treats them like integers, is likely satisfactory.
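To illustrate the distinction with plain stdlib operators (not the proposed API):

```ocaml
(* Physical equality (==) compares addresses; structural equality (=)
   compares contents.  For resources, identity is what matters. *)
let () =
  let r1 = ref 0 and r2 = ref 0 in
  assert (r1 == r1);       (* same address *)
  assert (not (r1 == r2)); (* distinct allocations are different *)
  assert (r1 = r2)         (* but their contents are still equal *)
```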
Oh sure, I get it, but some proposals have been worked on for longer, so they are more advanced implementation-wise and hopefully not too far from being ready for integration into OCaml.
Warning: I am not a compiler person, unlike many INRIA folks.
My understanding of region-based allocation is that all calls to malloc/free can be determined at compile time, i.e. there are no more GC passes trying to find what to keep and what to free. You allocate memory when you need it and you free it as soon as you don’t need it anymore (I guess you can use a memory pool rather than the system’s malloc/free calls). I am not sure this is applicable to OCaml, though. Experts should know.
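My (possibly naive) mental model of an arena, sketched in plain OCaml; the `arena` type and functions below are made up for illustration and are not part of any proposal:

```ocaml
(* A bump-pointer arena: each allocation is an O(1) pointer bump, and
   freeing the whole region at once is a single reset -- no GC pass is
   ever needed to find dead values inside it. *)
type arena = { buf : Bytes.t; mutable off : int }

let create size = { buf = Bytes.create size; off = 0 }

let alloc a n =
  if a.off + n > Bytes.length a.buf then failwith "arena full";
  let off = a.off in
  a.off <- a.off + n;
  off                      (* offset of the fresh n-byte slot *)

let reset a = a.off <- 0   (* free everything in one go *)
```

With `let a = create 1024`, the first `alloc a 8` returns offset 0, the next returns 8, and `reset a` reclaims both at once.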
I think this would be a major boon to the ecosystem. Rust is gaining a lot of mindshare on the grounds of safety.
Of course there are many angles to consider. Multicore compatibility is a great advantage for this proposal.
I’m not an OCaml expert yet, but from those I’ve interacted with, I think most OCaml hackers are heavily invested in the ability to write correct programs while still being productive on “real world” projects.
Just my two cents
Sorry for the late reply… The nice retrospective on region-based memory management you provide as your first link does a good job of explaining the drawbacks of region inference, which I understand are commonly considered show-stoppers for any application to OCaml (unpredictability, lack of a solution for separate compilation…). Even if one gives up on the idea of region inference, the idea of region (or arena) allocation does not quite fit the proposal, given that a lot of its expressiveness comes from the hypothesis that you can move resources. So I do not think that their benchmarks are meaningful for this proposal, given that it does not contain anything as efficient as deallocating a whole arena at once.
Interested people can find slides from my talk from Monday at Inria Paris (updated to take feedback into account).
Ah, you have the right paper in the end…
I’m not going to pretend like I’m an expert here.
Most of the criticisms of region-based memory allocation are that it’s prone to space leaks, that most languages don’t have the type system to support it, and that the performance boost, while significant, isn’t worth all of the issues it brings; at best it ends up as an add-on library.
These issues often force implementers to use it alongside a GC.
A bit late to the party, but a couple of years ago I played around with (ab)using the minor heap to implement something like region-based memory management. I don’t have access to the code anymore, but could probably recreate it if there was any interest?
Don’t ask me. Maybe @gadmm might be interested.
I sometimes care about parallel programs (to crunch numbers faster on multicore computers); I never care about concurrency (Lwt and Async users).
My hunch is that a region-based GC could be useful for parallel programs in OCaml.
I.e. a shared memory region, write-once-read-many, is declared.
An owner process fills it (the only allowed writer for this shm).
Reader processes might read it several times.
Once all readers have finished reading (some synchronization mechanism is necessary somewhere), the owner can reuse the whole region for another iteration.
I.e. this region does not need a regular GC; the goal is low latency for parallel programs.
Currently, something like this can be done using a bigarray, but I suspect performance could be better if things did not need to be marshalled/unmarshalled to/from a bigarray. The parmap library uses this bigarray trick.
I.e. this “shared memory GC” needs to be able to write any regular OCaml value in there, and there needs to be functionality to copy-and-read out of it (for readers).
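A minimal sketch of that marshalling round-trip, with an ordinary Bigarray standing in for the shared segment (a real version would map actual shared memory; everything here is stdlib):

```ocaml
(* The owner marshals a value into the (pretend-shared) buffer;
   readers copy the bytes back out and unmarshal their own copy. *)
let shm = Bigarray.(Array1.create char c_layout 4096)

(* Owner: serialize v into the buffer, return its length. *)
let write v =
  let s = Marshal.to_string v [] in
  String.iteri (fun i c -> shm.{i} <- c) s;
  String.length s

(* Reader: copy len bytes out, then deserialize. *)
let read len =
  let b = Bytes.create len in
  for i = 0 to len - 1 do Bytes.set b i shm.{i} done;
  Marshal.from_bytes b 0
```

With `let len = write [1; 2; 3]`, a reader’s `(read len : int list)` gives back `[1; 2; 3]`, at the cost of one serialization plus one copy per reader; that is exactly the overhead one would hope a shared-region GC could avoid.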
This being said, there are some functionalities like this in the ocamlnet library.
Also, the scalability of OCaml programs using parmap or parany is currently quite good.
That would just shave off a few more cycles, and make parallelism scale better for finer-grained computations.
Hack_parallel kind of fits your use case, though like the bigarray trick, there’s a whole lot of (de)serialization going on. IIRC, the (de)serialization has a significant performance impact, which prompted my experiment with (ab)using the minor heap as a region, as well as integrating something like the Ocamlnet functionality instead of marshalling the values.
I’ve chatted to @gadmm about this (I’ve vaguely been working with them on making off-heap values more first class, though ‘working’ is a stretch for what I’ve actually done).
Hack_parallel is an ugly hack with an even uglier interface.
I’ll stay with my parany library for the moment, since it fills all my parallel needs.
Agreed that it’s not pretty; I was mainly referring to the functionality.