How to "block" in an agnostic way?

I’m confused as to how Lazy.force can detect cycles:

# let rec x = lazy (Lazy.force x) in Lazy.force x ;;
Exception: CamlinternalLazy.Undefined.

The lazy state could remember the domain that started the evaluation, so that other domains would Suspend while further forcing from the same domain would raise.
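For concreteness, here is a sketch of what such a domain-aware state machine could look like. The constructor names and `check_reentry` are illustrative only, not the actual `CamlinternalLazy` representation:

```ocaml
(* Hypothetical states for a domain-aware lazy value (illustrative;
   not the real CamlinternalLazy representation). *)
type 'a lazy_state =
  | Done of 'a                (* evaluation finished *)
  | Thunk of (unit -> 'a)     (* not yet forced *)
  | Forcing of Domain.id      (* being evaluated by this domain *)

(* On force: another domain seeing [Forcing d] would suspend, while
   the same domain re-entering raises, detecting the cycle. *)
let check_reentry (st : 'a lazy_state) : unit =
  match st with
  | Forcing d when d = Domain.self () -> raise Lazy.Undefined
  | _ -> ()
```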

But when a lazy is suspended during its evaluation, it’s less clear that it is a mistake to force it again from the same domain:

let rec x = lazy (Fibre.yield () ;           Lazy.force y (* or 42, no cycle then! *))
and     y = lazy (               Lazy.force x )
in
Fibre.both (fun () -> Lazy.force x) (fun () -> Lazy.force y)

Assuming my ascii art ordering is correct, I don’t think that y should crash on Lazy.force x since the cycle has not been established yet. (And the suspicious Fibre.yield () can be replaced by forcing a lazy that is being evaluated on another domain, since that would perform a similar Suspend.)

Perhaps it’s fine to deadlock the fibres in that situation? Maybe the Fibre_context.t or the wakeup callback can be used to identify the fibres and hence the cycle? Or am I missing something obvious?

No, there is nothing obvious here. If you want to identify cycles in lazy computations and abort in this case, you need not only an agnostic way to block, but an agnostic approach to “identity”, whatever that is. As you remark, detecting cycles by using domain identity would also incorrectly report non-circular cases of concurrent scheduling as cycles.

We discussed this (and other aspects of lazy thunks) at length in Discussing the design of Lazy under Multicore · Issue #750 · ocaml-multicore/ocaml-multicore · GitHub. The main take-away I think is that this aspect was already broken in OCaml 4 in presence of threads (you would also get Undefined, specified at the time as detecting cycles, when threads would race forcing a value), so it’s okay to do something not-so-good in this area.

3 Likes

Fascinating discussion, but I think applications and popularity need to be discussed too.

IME concurrent usage of lazy in purely functional data structures (including lazy streams) is practically non-existent. However, when I’ve seen such data structures used their ease of use has been key so you’d want least surprise (i.e. concurrency safe) in that context if at all possible.

I’ve actually replaced my use of traditional purely functional data structures (particularly Map) with imperative collections wrapped with an undo buffer and finally a lock. Hash tables are so much faster than trees. I wonder how that might be implemented in multicore OCaml.

By far the most common usage of lazy in a concurrent environment that I’ve seen is concurrency-safe initialization, e.g. lazy globals. However, that is arguably as a workaround for hard-to-debug type initialization errors: a problem OCaml doesn’t have.

That sounds crazy. Does it actually work? Do you have an example?

Did you try so called Hash Array Mapped Tries? The performance should be in between traditional balanced trees and Hash-based collections. I’m curious about exact numbers…

That sounds crazy. Does it actually work? Do you have an example?

Yes, it works extremely well. I can port it to OCaml if you’re interested.

Yeah that would be awesome. Is the outer interface completely functional then?

Yes, I have tried everything. :slight_smile:

I don’t like hash tries much. I don’t think they represent a great trade-off. They are a bit faster than balanced binary trees (BBTs) but nowhere near a hash table. Note that hash tries are artificially fast on the JVM (i.e. in Clojure) because the JVM compares int hashes really quickly but struggles with polymorphic comparisons. On the other hand, hash tries aren’t ordered in a useful way like BBTs are. The only big advantage of hash tries over BBTs is that many consecutive insertions can be coalesced into mutations, e.g. during construction. I don’t find that very useful.

Another approach that I think is really interesting is hash-consed Patricia trees. As they are history independent, any given set or map is physically equal if it is semantically equal, e.g. {1,2,3}=={1,2,3} regardless of how they were constructed. So you get maximal sharing. I’d like to build a language that hash-conses everything just to see how well it works. Some applications (e.g. Antimirov derivatives) stand to benefit enormously from such an approach. It also turns a recursive descent parser into a packrat parser for free.
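For readers unfamiliar with the technique, here is a minimal hash-consing sketch. It uses a toy global `Hashtbl`; real implementations use weak tables so that unused representatives can be collected:

```ocaml
(* Minimal hash-consing sketch: structurally equal terms are mapped to a
   single physical representative, so semantic equality becomes physical
   equality (==). *)
type term = Leaf of int | Node of term * term

let table : (term, term) Hashtbl.t = Hashtbl.create 64

let hashcons (t : term) : term =
  match Hashtbl.find_opt table t with
  | Some t' -> t'                    (* reuse the existing representative *)
  | None -> Hashtbl.add table t t; t

(* Smart constructors that always hash-cons. *)
let leaf n = hashcons (Leaf n)
let node l r = hashcons (Node (l, r))

(* Two independently built trees share one physical representation. *)
let a = node (leaf 1) (leaf 2)
let b = node (leaf 1) (leaf 2)
let () = assert (a == b)
```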

Completely functional API, yes. The only real caveat (besides the collection being unordered, of course) is that it only works well for “single lineage” use. In other words, if you keep operating on different old versions it has to repeatedly undo changes and then redo changes which is asymptotically slower. Realistically, you just “clone” a collection when you’ll be doing that.
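A rough sketch of that idea, using the classic re-rooting technique for persistent arrays applied to a `Hashtbl` (the names are mine and the lock is omitted): the newest version owns the real table, older versions are chains of undo records, and touching an old version re-roots the table onto it. That is exactly why single-lineage use is fast and ping-ponging between versions is asymptotically slower.

```ocaml
(* Persistent map over one mutable hash table plus an undo log (sketch). *)
type ('k, 'v) node =
  | Table of ('k, 'v) Hashtbl.t
  | Diff of 'k * 'v option * ('k, 'v) t  (* key, previous binding, next *)
and ('k, 'v) t = ('k, 'v) node ref

let create n : ('k, 'v) t = ref (Table (Hashtbl.create n))

(* Make [t] own the real table, undoing diffs along the way and
   reversing them so other versions stay reachable. *)
let rec reroot (t : ('k, 'v) t) : ('k, 'v) Hashtbl.t =
  match !t with
  | Table h -> h
  | Diff (k, old, next) ->
    let h = reroot next in
    let cur = Hashtbl.find_opt h k in
    (match old with
     | Some v -> Hashtbl.replace h k v
     | None -> Hashtbl.remove h k);
    t := Table h;
    next := Diff (k, cur, t);
    h

let find_opt t k = Hashtbl.find_opt (reroot t) k

(* Purely functional interface: returns a new version, the old one
   becomes a diff record. *)
let add t k v : ('k, 'v) t =
  let h = reroot t in
  let old = Hashtbl.find_opt h k in
  Hashtbl.replace h k v;
  let t' = ref (Table h) in
  t := Diff (k, old, t');
  t'
```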

Incidentally, where is a good place to post OCaml code these days? I stopped using Github when they changed authentication because I never figured out how to work the new approach. Is there an OCaml snippets site I can just post code on?

Hmm… That’s not very different from just using a plain Hashtbl though, right? I mean, sure, you can use an older version, but only if you’re careful…

Most stuff is still on github. MS is awesome nowadays so I have no problem with it. Gitlab is another option.

The semantics are the same as Set or Map but the performance is different. So you only need to be careful if you want it to run quickly. You don’t need to be careful for safety reasons as you must with a hash table.

1 Like

Note: I think that “persistent interface over mutable hashtables” is a rather different discussion from “How to block in an agnostic way”, and Discuss doesn’t support threading so well, so I would encourage you to create a new thread to discuss this.

4 Likes

My current understanding of cancellation, thanks to the other thread ( Understanding cancellation (in eio) ), is that it only comes into play if we want to use Suspend to expose low-level asynchronous operations, but not in the case where we use Suspend to wait on a computation happening somewhere else. I think that the latter is a very common use-case for “agnostic blocking”, and it can use your simpler Suspend effect. (Even a cancellation-aware scheduler can use the simple Suspend in this case.)

If we want to later provide a scheduler-agnostic way to implement low-level asynchronous operations with cancellation, we would possibly want a second effect with a different interface, but the users of the first effect would need no change.
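For reference, the simple `Suspend` effect being discussed can be modelled in OCaml 5 along these lines. The handler here is a toy that resumes immediately; a real scheduler would enqueue the continuation and call it later:

```ocaml
open Effect
open Effect.Deep

(* The agnostic-blocking effect: the suspender hands the scheduler a
   function that registers a "wakener" to call when the result is ready. *)
type _ Effect.t += Suspend : (('a -> unit) -> unit) -> 'a Effect.t

(* Trivial scheduler: resume as soon as the wakener fires. *)
let run (main : unit -> unit) : unit =
  match_with main ()
    { retc = (fun () -> ());
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Suspend register ->
          Some (fun (k : (a, unit) continuation) ->
            register (fun v -> continue k v))
        | _ -> None) }

let () =
  run (fun () ->
    let v = perform (Suspend (fun wake -> wake 21)) in
    assert (v + v = 42))
```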

1 Like

No, I just picked a bad example! Low-level operations can use a private effect and do whatever they like, but agnostic blocking needs cancellation too. Examples we have so far:

  • Moving Eio’s Promise, Stream and Semaphore to stdlib and using them with other schedulers. All of these need to support cancellation.

  • Communicating with domainslib. We should certainly be able to stop waiting for a domainslib Chan.t, for example. Presumably domainslib itself will support cancellation at some point, and in that case we should let it know about the cancellation too.

The kind of operations that might not need to support cancellation are things like unlinking a file, where the OS might provide no way to stop the operation once it has been submitted. In that case you just have to wait for it to complete.

When we cancel a task/fibre, how quickly do we want to let the synchronization structures know? For example, if a task that is currently blocked on an MVar is cancelled, do we want the MVar to be able to skip over it, and when? Does it remove the task from the MVar at the point of cancellation (strict), or lazily, when the MVar attempts to resume the task? For example, the lazy semantics can be obtained by changing the interface of the scheduler to

(* Returns [true] if the task is successfully resumed,
   [false] if the task was cancelled. *)
type 'a resumer = 'a -> bool
type _ eff += Suspend : ('a resumer -> unit) -> 'a eff

It is unclear how to do strict removal. If the underlying structures are implemented using lock-free libraries, then the design of the interface should be expressive enough not to restrict where the linearization point will be.
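To illustrate the lazy-removal semantics that this interface permits, here is a toy MVar-like cell (an effect-free model with illustrative names, not a real MVar): cancelled waiters are only discovered, and skipped, when their resumer returns `false`.

```ocaml
(* A resumer reports whether the suspended task actually resumed. *)
type 'a resumer = 'a -> bool

type 'a mvar = {
  mutable value : 'a option;
  waiters : 'a resumer Queue.t;
}

let put (mv : 'a mvar) (v : 'a) : unit =
  (* Hand the value to the first waiter that is still alive;
     cancelled waiters are skipped over lazily. *)
  let rec deliver () =
    match Queue.take_opt mv.waiters with
    | None -> mv.value <- Some v  (* nobody waiting: store the value *)
    | Some resume ->
      if resume v then ()         (* consumed by a live waiter *)
      else deliver ()             (* waiter was cancelled: skip it *)
  in
  deliver ()
```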

This paper looks quite interesting for this discussion: https://arxiv.org/pdf/2111.12682.pdf.

1 Like

I think “allowing users the freedom to develop their own concurrency libraries” is a big mistake because it can lead to users actually developing their own concurrency libraries. Imagine for a moment what it would be like if we had two incompatible async libraries?

2 Likes

The semantics issue that @talex5 is pointing out, I think, is the following: there is a difference between suspending on shared result and suspending on a unique resource.

  • If I suspend to wait until a lazy computation is available to all its waiters, cancellation is just a question of performance (if my suspension callback gets called but I was invalidated, the scheduler is going to try to invoke me later only to find that it is useless).
  • But if I suspend until a resource is available that I’m planning to consume, for example “take the next Stream element and give it to me”, then the producer I’m suspending on needs to learn that I was cancelled for correctness reasons, not just efficiency. The resource it produced needs to be given to another waiter, or kept, if I was cancelled.

(Again we see a transactional expectation: either the resource is consumed and I run, or I was cancelled and no resource is wasted.)

How does that work? A long time might pass between the moment the resumer function is called and the moment execution is actually resumed, which means that the task might be cancelled during that period. This leaves three choices:

  1. The boolean is only an approximation, for optimization purposes.
  2. The resumer function does not return until the task has actually resumed.
  3. Tasks cannot be cancelled if they are already waiting to be resumed.

I feel like options 2 and 3 are bad ideas, which only leaves option 1. Is there some other option I could have missed?

I believe that (3) is the only sensible option here, at least if you want to provide transactional guarantees (from the suspender side: either the resource we were waiting for was produced, and we run, or cancellation happened and the resource was not produced / not wasted.) I tried to discuss this in Understanding cancellation (in eio) - #16 by gasche

1 Like

I don’t see why this is a problem. We may expect to reason about cancellation using linearizability. Either the cancellation of a task takes effect before the MVar put or get operation by that task; in that case, the MVar operation should not be matched. Or the cancellation occurs after the MVar operation is matched; in that case, the operation will be added to the scheduler queue to resume it, and this case is handled no differently than cancelling a task that is already in the scheduler queue.

We may even ascribe a weaker semantics to cancellation: cancelled tasks are eventually reaped. In this case, the pending MVar operation of a cancelled task may still be matched. This semantics is weaker than linearizability and may potentially lead to better performance.

1 Like