I have a question about a corner of the OCaml 5 design for ephemerons.
When an ephemeron key or value is read, e.g. using Weak.get, the GC may be in a state where marking has finished but sweeping has not, so that the ephemeron being read is not marked as live but has not yet been emptied. The implementation checks for this situation and empties the ephemeron before returning “empty” from the get operation.
My question is about the potential alternative choice of not emptying the ephemeron and returning the value that is still there. If the value were returned, it would need to be marked live, but the other path through get calls caml_darken anyhow so I guess that would not be a problem. So I wonder if there are other motivations for the current choice? I ask because in common uses it would be better for user code to receive the existing value, instead of needing to recompute it.
(I make no claim this is a big deal, I’m just curious and couldn’t work out the motivation from the code.)
(I’m trying to page in the details, and may be off. With that caveat…)
“but the other path through get calls caml_darken anyhow so I guess that would not be a problem.”
It would be a problem to add marking work through caml_darken after the marking phase is done. As with OCaml 4, OCaml 5 gives linearisable semantics to ephemeron cleaning:
In order to respect the semantic of the ephemerons concerning dead values, the getter and setter must work as if the cleaning of all the ephemerons have been done at once.
In OCaml 5, as in OCaml 4, the reachability of ephemeron keys and values is decided at the end of the mark phase. That decision does not change afterwards, though the cleaning itself may happen later.
In ephe_get_field, note that clean_field is called unconditionally, while caml_darken sits behind the conditional check elt != caml_ephe_none, so it only runs when the field is non-empty. In the sweep phase, caml_darken, when called, is a no-op: since elt is not caml_ephe_none after cleaning, it must be reachable (hence MARKED). No mark work is added here. caml_darken is only needed in the mark phase, since ephe_get_field makes a strong reference out of a weak one, and it is ok to add mark work at that point.
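To make the control flow concrete, here is a toy C model of the read path described above. It is only a sketch under stated assumptions, not the runtime's code: the types obj, ephe and gc_phase, the counter mark_work_added, and the functions darken, clean_field and get_field are hypothetical stand-ins, and NULL plays the role of caml_ephe_none.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is a hypothetical stand-in,
   not a definition from the OCaml runtime. */
typedef enum { PHASE_MARK, PHASE_SWEEP } gc_phase;

typedef struct obj {
  bool marked;   /* set once the object is known to be reachable */
  int payload;
} obj;

typedef struct {
  obj *field;    /* NULL plays the role of caml_ephe_none */
} ephe;

static int mark_work_added = 0;  /* counts objects newly marked by darken */

/* Model of darkening: mark an object; a no-op if it is already marked. */
static void darken(obj *o) {
  if (!o->marked) {
    o->marked = true;
    mark_work_added++;
  }
}

/* Model of cleaning: once marking is over, an unmarked referent is dead,
   so the field is emptied. Does nothing during the mark phase. */
static void clean_field(ephe *e, gc_phase phase) {
  if (phase == PHASE_SWEEP && e->field != NULL && !e->field->marked)
    e->field = NULL;
}

/* Model of the read path: clean unconditionally, darken only if non-empty. */
static obj *get_field(ephe *e, gc_phase phase) {
  clean_field(e, phase);
  obj *elt = e->field;
  if (elt != NULL) {
    /* After cleaning in the sweep phase, a surviving referent is marked,
       so the darken call below cannot add mark work. */
    if (phase == PHASE_SWEEP) assert(elt->marked);
    darken(elt);
  }
  return elt;
}
```

In this model, reading an ephemeron whose referent is unmarked during the sweep phase returns NULL and leaves mark_work_added unchanged, while the same read during the mark phase marks the referent and returns it.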
Thanks KC. I missed that the current call to caml_darken cannot add marking work in the sweep phase, and it makes sense to me that adding marking work at that point would be bad.
I also did not appreciate how strong the semantics of ephemeron cleaning that the runtime aims for really are. Linearizable cleaning of all the ephemerons at once is more than I expected, which was linearizable cleaning of each ephemeron one-by-one. If you recall any code that needed this, or any other considerations behind it, I’d be curious to take a look. (I’m just curious, so only if you recall or happen to have pointers to previous discussions.)
Getting linearisable semantics for all ephemerons at once is the natural model, since the choice can be made at the end of the mark phase. There is a global barrier at the mark-to-sweep transition, where all the domains are stopped. Given that ephemerons are mutable, this choice also seems sensible. The strong semantics make it easier to grok the expectations for implementing ephemerons.
@bobot may be able to say something from the user perspective.