Understanding cancellation (in eio)

@orbitz that sounds interesting, can you include a link to your work?

Intuitively, cancellation (as I understand it from the Eio docs and the discussion here) is related to the scheduler, not completely independent of it: it is the scheduler that keeps track of the structural relations between fibres (which ones must be cancelled together), so cancellation information travels from one fibre to another through the scheduler. From a fibre's perspective, “I want to suspend” is information it gives to the scheduler, but “I’m being cancelled” is information coming from the scheduler.

Maybe it’s possible to keep track of the “fibre structure” (switches, structured concurrency) outside the scheduler, to handle cancellation separately? Is this what your approach does?

Note: I’m asking about cancellation from the perspective of an agnostic Suspend primitive. I wonder if its interface must take cancellation into account, or whether it can be handled separately. (In Eio, Suspend talks about cancellation, but in a rather roundabout way in my opinion.) Looking at the Eio code, there seem to be two different use-cases for Suspend:

  • High-level, synchronous usage: when you wait on a programmatic event that will occur in another domain or on another fibre; for example “I wait until someone writes a value to this channel”. You suspend to wait on “something else”, and there isn’t anything special cancellation-wise.
  • Low-level, asynchronous usage: when you create an asynchronous operation by interacting with a low-level interface / the operating system (or libuv, etc.). This is the flavor of the code examples that were shared above in this thread. There we want to Suspend ourselves to give control back to the scheduler, but we also want to handle cancellation specifically (not just to clean up the resources in our continuation) so that we can cancel this asynchronous operation.
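
To make the two use-cases concrete, here is a rough sketch of what a scheduler-agnostic interface could look like if it distinguished them. This is not Eio's API: every name (Suspend_cancellable, the toy single-fibre scheduler, the fake "OS" table os_pending) is made up for illustration, and it assumes OCaml >= 5.0.

```ocaml
(* Hypothetical sketch, not Eio's API. Requires OCaml >= 5.0. *)
open Effect
open Effect.Deep

exception Cancelled

(* How a suspended fibre is woken up: with a value or with an exception. *)
type 'a resume = ('a, exn) result -> unit

type _ Effect.t +=
  | Suspend : ('a resume -> unit) -> 'a Effect.t
  (* The registration function also returns a hook that cancels the operation. *)
  | Suspend_cancellable : ('a resume -> (exn -> unit)) -> 'a Effect.t

(* Toy scheduler state for a single fibre. *)
type sched = {
  ready : (unit -> unit) Queue.t;
  mutable on_cancel : (exn -> unit) option;
}

let create () = { ready = Queue.create (); on_cancel = None }

let start s main =
  (* Build a one-shot wake-up function for continuation [k]. *)
  let resume_with (type a) fired (k : (a, unit) continuation) : a resume =
    fun r ->
      if not !fired then begin
        fired := true;
        s.on_cancel <- None;
        Queue.push
          (fun () ->
            match r with
            | Ok v -> continue k v
            | Error ex -> discontinue k ex)
          s.ready
      end
  in
  match_with main ()
    { retc = (fun () -> ());
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Suspend register ->
          Some (fun (k : (a, unit) continuation) ->
            register (resume_with (ref false) k))
        | Suspend_cancellable register ->
          Some (fun (k : (a, unit) continuation) ->
            let fired = ref false in
            let hook = register (resume_with fired k) in
            (* Only remember the hook if the operation is still pending. *)
            if not !fired then s.on_cancel <- Some hook)
        | _ -> None) }

let drain s =
  while not (Queue.is_empty s.ready) do (Queue.pop s.ready) () done

(* "I'm being cancelled" flows from the scheduler to the suspended fibre. *)
let cancel s =
  match s.on_cancel with
  | Some hook -> s.on_cancel <- None; hook Cancelled
  | None -> ()

(* High-level use: wait for a value that someone else will provide. *)
let waiter : int resume option ref = ref None
let await_value () : int =
  perform (Suspend (fun resume -> waiter := Some resume))

(* Low-level use: a fake "OS" table of pending operations, keyed by id. *)
let os_pending : (int, unit -> unit) Hashtbl.t = Hashtbl.create 8
let os_sleep id : unit =
  perform (Suspend_cancellable (fun resume ->
    Hashtbl.replace os_pending id (fun () -> resume (Ok ()));
    fun ex -> Hashtbl.remove os_pending id; resume (Error ex)))
```

In this sketch the high-level case needs nothing beyond resume, while the low-level case has to hand the scheduler a hook so that cancelling the fibre also withdraws the pending operation.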

It’s not clear to me whether this low-level, asynchronous usage should be implemented “inside the scheduler” or “in user code”. In any case, this requires communication with the scheduler. In the Eio codebase, some operations are implemented in the same effect handler as the scheduler, while some operations are implemented “outside” the scheduler, but using low-level / private APIs that break the abstraction barrier.

I think it should be possible to implement “low-level asynchronous cancellation” purely on the user side but, depending on the requirements, this probably requires a more complex interaction protocol (with the scheduler) than just Suspend. But maybe it’s better to leave it to the scheduler to avoid having to expose this protocol. Or maybe a nice design would be a two-layer system, with a “high-level scheduling interface” (with built-in cancellation) implemented in terms of a “low-level scheduling interface” that implements cancellation outside the (low-level) scheduler.
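
As a sketch of what I mean by two layers (again with made-up names, not Eio's actual design, and assuming OCaml >= 5.0): the low-level scheduler below only knows a plain Suspend, and a hypothetical user-side "cancellation context" records which pending operations must be cancelled together.

```ocaml
(* Hypothetical two-layer sketch, not Eio's API. Requires OCaml >= 5.0. *)
open Effect
open Effect.Deep

exception Cancelled

type 'a resume = ('a, exn) result -> unit

(* Low-level layer: the scheduler knows nothing about cancellation. *)
type _ Effect.t += Suspend : ('a resume -> unit) -> 'a Effect.t

let ready : (unit -> unit) Queue.t = Queue.create ()

let spawn main =
  match_with main ()
    { retc = (fun () -> ());
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Suspend register ->
          Some (fun (k : (a, unit) continuation) ->
            register (fun r ->
              Queue.push
                (fun () ->
                  match r with
                  | Ok v -> continue k v
                  | Error ex -> discontinue k ex)
                ready))
        | _ -> None) }

let drain () =
  while not (Queue.is_empty ready) do (Queue.pop ready) () done

(* High-level layer, in user code: fibres sharing a [ctx] are cancelled
   together. The "fibre structure" lives here, outside the scheduler. *)
type ctx = {
  mutable cancelled : exn option;
  mutable hooks : (int * (exn -> unit)) list;
  mutable next : int;
}

let create_ctx () = { cancelled = None; hooks = []; next = 0 }

(* [start finish] launches the operation and returns a function aborting it. *)
let await_cancellable ctx (start : 'a resume -> (unit -> unit)) : 'a =
  match ctx.cancelled with
  | Some ex -> raise ex
  | None ->
    perform (Suspend (fun resume ->
      let id = ctx.next in
      ctx.next <- id + 1;
      let finished = ref false in
      (* Wake the fibre at most once and forget its cancellation hook. *)
      let finish r =
        if not !finished then begin
          finished := true;
          ctx.hooks <- List.remove_assoc id ctx.hooks;
          resume r
        end
      in
      let abort = start finish in
      if not !finished then
        ctx.hooks <-
          (id, (fun ex -> abort (); finish (Error ex))) :: ctx.hooks))

let cancel ctx ex =
  ctx.cancelled <- Some ex;
  let hooks = ctx.hooks in
  ctx.hooks <- [];
  List.iter (fun (_, hook) -> hook ex) hooks
```

The price of this split is visible in await_cancellable: the protocol between the operation, the context and the scheduler (one-shot wake-up, hook registration and removal) is now something user code has to get right.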

Finally: I’m not sure if the Eio code right now corresponds to a two-layer system where the two layers were manually smashed/inlined for performance reasons (having a single handler), or whether the boundaries are unclear (to me at least) because the library grew organically and some things should be structured better.