even though OCaml is using less than 1 GB of memory when fork is called and there is plenty of RAM and swap space available. The same program runs just fine on macOS on my laptop.
I’m running OCaml 4.05.0 on the Ubuntu machine and OCaml 4.06.0 on the laptop.
The most likely causes in a server environment are:
* the RLIMIT_NPROC soft resource limit (set via setrlimit(2)), which limits
the number of processes and threads for a real user ID, was reached;
* the PID limit (pids.max) imposed by the cgroup "process number" (PIDs)
controller was reached.
ulimit -a | grep process will give you a hint about the first. The second… cat /proc/$$/cgroup maybe? If it’s another user that’s unable to fork, check that user’s limits instead. You could also try something like perl -le 'print fork' (should print two numbers, one of them 0) to test forking independently of OCaml.
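To make the two checks above concrete, here is a rough sketch that prints both limits for the current shell. It assumes Linux with /proc mounted and cgroup v2 mounted at /sys/fs/cgroup. The function names nproc_soft_limit and cgroup_pids_max are made up for this example, and each one prints "unknown" when the file it needs is not readable.

```shell
#!/bin/sh
# Soft RLIMIT_NPROC of the current process, read from /proc/self/limits.
# Prints "unknown" when /proc is not available (for example, off Linux).
nproc_soft_limit() {
    if [ -r /proc/self/limits ]; then
        # Line layout: "Max processes  <soft>  <hard>  processes"
        awk '/^Max processes/ { print $3 }' /proc/self/limits
    else
        echo unknown
    fi
}

# pids.max of the cgroup this shell belongs to.
# Assumption: cgroup v2 mounted at /sys/fs/cgroup; prints "unknown" otherwise.
cgroup_pids_max() {
    # On cgroup v2, /proc/self/cgroup holds a single line "0::<path>".
    cg=$(sed -n 's/^0:://p' /proc/self/cgroup 2>/dev/null)
    f="/sys/fs/cgroup${cg}/pids.max"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo unknown
    fi
}

echo "RLIMIT_NPROC (soft): $(nproc_soft_limit)"
echo "cgroup pids.max:     $(cgroup_pids_max)"
```

A pids.max value of "max" means the PIDs controller imposes no limit on that cgroup. Run the script as the user whose process fails to fork, since both limits can differ per user and per service.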
I have the same problem even after raising the limits with:
ulimit -n 65535
ulimit -s 32768
ulimit -l 327680
I have 64 GB of memory and the OCaml program uses barely 2 GB. I use Lwt and Lwt_preemptive. Could Lwt_preemptive have a bug in this respect?
I don’t mean running the debugger at the OCaml level. Find out what was passed to fork down at the syscall level and what fork returned, especially if it returned an error code in errno.
clone isn’t the same as fork. This is probably a different issue. How many threads do you have running in this process? You may be running out of task structures.
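A quick way to answer the thread-count question on Linux is to read it from /proc. This is only a sketch: thread_count and threads_max are names invented for this example, and both print "unknown" when /proc is not available.

```shell
#!/bin/sh
# Number of threads in the process with PID $1 (default: this shell).
thread_count() {
    status="/proc/${1:-$$}/status"
    if [ -r "$status" ]; then
        awk '/^Threads:/ { print $2 }' "$status"
    else
        echo unknown
    fi
}

# System-wide ceiling on the number of tasks (processes plus threads).
threads_max() {
    if [ -r /proc/sys/kernel/threads-max ]; then
        cat /proc/sys/kernel/threads-max
    else
        echo unknown
    fi
}

echo "threads in PID ${1:-$$}: $(thread_count "$1")"
echo "kernel threads-max:      $(threads_max)"
```

Pass the PID of the OCaml process as the first argument. If the thread count keeps climbing toward one of the limits while the program runs, that points at thread creation rather than memory.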
It might be the same message from OCaml, but it’s not the same to the kernel.
Can you produce a minimized example of this that fails on Linux (as few lines as possible to reproduce the behavior) and I can try to debug it from there on my own machine?
We used to have a problem with Core+Lwt+Fork which, as far as I remember, was connected somehow with the recording of backtraces: in some threads backtraces were recorded, while in others they weren’t, and this led to heap corruption, with segfaults/OOMs. If I remember correctly, the workaround was to either enable or disable backtraces explicitly using environment variables and/or the Printexc module. Hope this helps, can’t remember more, it was 10 years ago or so.