Caml_thread_tick and signal SIGPREEMPTION

I notice that multi-threaded OCaml programs spin up a dedicated thread that runs caml_thread_tick.

/* The tick thread: posts a SIGPREEMPTION signal periodically */

static void * caml_thread_tick(void * arg)
{
  struct timeval timeout;
  sigset_t mask;

  /* Block all signals so that we don't try to execute an OCaml signal handler*/
  sigfillset(&mask);
  pthread_sigmask(SIG_BLOCK, &mask, NULL);
  while(! caml_tick_thread_stop) {
    /* select() seems to be the most efficient way to suspend the
       thread for sub-second intervals */
    timeout.tv_sec = 0;
    timeout.tv_usec = Thread_timeout * 1000;
    select(0, NULL, NULL, NULL, &timeout);
    /* The preemption signal should never cause a callback, so don't
     go through caml_handle_signal(), just record signal delivery via
     caml_record_signal(). */
    caml_record_signal(SIGPREEMPTION);
  }
  return NULL;
}

Is there more information about what this is and why it’s necessary?

You can follow along with what is happening here: https://github.com/ocaml-multicore/ocaml-multicore/pull/381

I’d like to revive this thread, since I have the same question.

If we have two system threads, why isn’t it enough to leave preemption up to the operating system? Why do we need the tick thread as well? I’m not an expert, but I’d assume that the OS would be able to preempt a long-running computation on its own.

Context around my question: I have been doing performance analysis of some code using Lwt for cooperative concurrency. However, the workloads don’t perform IO very frequently, and in practice we are not good at inserting enough manual yield points. As a result, it’s very difficult to keep latencies under control. Preemption is quite reasonable for these workloads, so I wanted to explore using threads for concurrency instead of Lwt, but I am confused by the tick thread.

The tick thread does not have much to do with preemption in that sense. OCaml’s garbage collector currently requires all the threads of the process to cooperate to avoid concurrent accesses. In particular, if a thread is blocked on a system call, then the whole process is stuck until this thread can take part in the current collection. To avoid this issue, the tick thread can perform the collection in place of the blocked thread.

The tick thread (one per domain) forces a call to yield() every 50ms: in OCaml 4 via a signal, and in OCaml 5 via a more civilized atomic flag, requested_external_interrupt, in the domain state. (This is different from the backup thread from OCaml 5.)
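To make the flag-based variant concrete, here is a minimal standalone sketch in C. It is not the runtime's code: yield_requested is a hypothetical stand-in for requested_external_interrupt, and poll_safe_point and run_demo are names invented for this illustration. It only shows the shape of the mechanism, a ticker that sets an atomic flag and a worker that notices it at points of its own choosing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

#define TICK_MS 50 /* same interval as discussed above */

/* Hypothetical stand-in for the runtime's per-domain interrupt flag. */
static atomic_bool yield_requested = false;
static atomic_bool stop_tick = false;

/* The ticker: sleeps for one interval, then requests a yield. */
static void *tick_thread(void *arg)
{
  (void)arg;
  while (!atomic_load(&stop_tick)) {
    struct timeval timeout = { .tv_sec = 0, .tv_usec = TICK_MS * 1000 };
    select(0, NULL, NULL, NULL, &timeout);
    atomic_store(&yield_requested, true);
  }
  return NULL;
}

/* Called by the worker where it is safe to give up the lock.
   Clears the flag and reports whether a yield was requested. */
static bool poll_safe_point(void)
{
  return atomic_exchange(&yield_requested, false);
}

/* Computes until wanted_yields requests have been observed.
   Returns the number of observed requests, or -1 on thread creation failure. */
int run_demo(int wanted_yields)
{
  pthread_t tick;
  int yields = 0;
  volatile unsigned long work = 0;

  assert(wanted_yields >= 0);
  atomic_store(&stop_tick, false);
  atomic_store(&yield_requested, false);
  if (pthread_create(&tick, NULL, tick_thread, NULL) != 0) return -1;

  while (yields < wanted_yields) {
    work++; /* stand-in for real computation */
    if (poll_safe_point()) {
      /* A real runtime would release and reacquire its lock here. */
      yields++;
    }
  }

  atomic_store(&stop_tick, true);
  pthread_join(tick, NULL);
  return yields;
}
```

Calling run_demo(2) returns 2 after roughly two tick intervals. The point of the design is that the ticker never interrupts the worker; it only leaves a request that the worker honors the next time it polls.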

Preemption with threads has had fairness problems, as witnessed by various issues filed on this topic. This problem with threads was also mentioned to me by Coq developers.

The OS cannot directly preempt because the preempted thread must release the domain lock and this must occur at a safe GC point. Implementing yield with OS primitives also proved problematic. IIUC, nothing in the current implementation ensures fairness.
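The lock handoff described above can be sketched with plain pthreads. This is an illustration under simplifying assumptions, not the runtime's implementation: master_lock, worker and run_two_workers are hypothetical names, and the safe point is a fixed iteration count rather than a tick. If the OS deschedules a worker in the middle of the loop, the lock stays held and the other worker still cannot run; only the explicit unlock at the safe point lets it in.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define SAFE_POINT_EVERY 1000 /* hypothetical spacing between safe points */

/* Hypothetical stand-in for the domain lock. */
static pthread_mutex_t master_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

static void *worker(void *arg)
{
  int iterations = *(const int *)arg;

  pthread_mutex_lock(&master_lock);
  for (int i = 1; i <= iterations; i++) {
    shared_counter++; /* safe without atomics: the master lock is held */
    if (i % SAFE_POINT_EVERY == 0) {
      /* Safe point: hand the lock over, then wait to get it back. */
      pthread_mutex_unlock(&master_lock);
      sched_yield();
      pthread_mutex_lock(&master_lock);
    }
  }
  pthread_mutex_unlock(&master_lock);
  return NULL;
}

/* Runs two workers to completion and returns the final counter,
   or -1 if a thread could not be created. */
long run_two_workers(int iterations)
{
  pthread_t a, b;

  assert(iterations > 0);
  shared_counter = 0;
  if (pthread_create(&a, NULL, worker, &iterations) != 0) return -1;
  if (pthread_create(&b, NULL, worker, &iterations) != 0) {
    pthread_join(a, NULL);
    return -1;
  }
  pthread_join(a, NULL);
  pthread_join(b, NULL);
  return shared_counter;
}
```

run_two_workers(100000) returns 200000, since every increment happens under the lock. Note that nothing in the unlock, sched_yield, lock sequence guarantees that the other thread actually wins the lock; the same thread may reacquire it immediately, which is the kind of fairness gap mentioned above.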

I imagine both fairness issues and the 50ms interval could be a problem if your constraint is latency.

  • If your analysis shows fairness problems, then it is better to file an issue.
  • For the 50ms delay you could imagine running your own asynchronous callback to call yield regularly. For running an asynchronous callback regularly, memprof callbacks have proved very reliable, as they are called according to actual work done rather than clock time (e.g. you could have a version of memprof-limits that calls yield, instead of interrupting the thread). Let me know of any experiment in this area.