Using a bigarray as a shared memory for parallel programming

UnixJunkie · December 10, 2019, 4:41am

I am wondering if memory mapped bigarrays are usable as a shared memory in order to do parallel programming.

I am thinking about the write-one-read-many mode of communication, where a writer
process would create a bigarray as a file on disk.
Once this file is created, several reader processes would access it for reading
(only for reading) by mapping it into memory (Unix.map_file).

If I am not mistaken, this would allow sharing all types supported by bigarrays between processes, without having to marshal/unmarshal.

The synchronization between readers and writers would be done outside of the bigarray, using semaphores.
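
A minimal sketch of the scheme described above, assuming a one-dimensional float64 array. The function names `create_shared` and `open_shared` are hypothetical, and synchronization between processes is left out entirely. It needs the unix library.

```ocaml
open Bigarray

type farray = (float, float64_elt, c_layout) Array1.t

(* Writer side: create (or truncate) the backing file and map it.
   With shared = true, stores go to the file's pages, and map_file
   grows the file to fit the requested dimension. *)
let create_shared (path : string) (n : int) : farray =
  let fd = Unix.openfile path [Unix.O_RDWR; Unix.O_CREAT; Unix.O_TRUNC] 0o600 in
  let ga = Unix.map_file fd float64 c_layout true [| n |] in
  Unix.close fd; (* the mapping stays valid after the descriptor is closed *)
  array1_of_genarray ga

(* Reader side: map the existing file. A shared mapping requires a
   descriptor opened for writing, so "read only" is a convention the
   readers must respect here, not something the mapping enforces.
   A dimension of -1 lets map_file deduce the size from the file size. *)
let open_shared (path : string) : farray =
  let fd = Unix.openfile path [Unix.O_RDWR] 0 in
  let ga = Unix.map_file fd float64 c_layout true [| -1 |] in
  Unix.close fd;
  array1_of_genarray ga
```

The writer would fill the array returned by `create_shared`, signal the readers by whatever mechanism is chosen, and each reader would then call `open_shared` on the same path.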

mbouaziz · December 10, 2019, 4:11pm

Your idea reminds me a lot of the shared memory used in Facebook Hack/Flow/Pyre:

bluddy · December 10, 2019, 7:11pm

We really need a library similar to Python’s multiprocessing. It has pipes (via serialization), queues, and shared memory support for primitive C types (like bigarray). It’s great and fairly easy to use. And no monads – those should be optional.

We’d probably need to have atomic reference counting in the shared memory area, with finalizers from OCaml code.

samwgoldman · December 10, 2019, 8:28pm

Yes, Bigarray is very suitable for this task. The shared memory module used by Hack/Flow/Pyre does not use Bigarray, but I am actually working on a slightly lower-level shared memory API that does.

Using Bigarray from OCaml is actually a bit nicer than calling into C, since OCaml implements a number of primitives for reading/writing into Bigarray which can be optimized into simple mov instructions, avoiding the need for a call entirely. It’s also nice to be able to write small functions and modules and rely on ocamlopt’s inliner.

For an example of this, take a look at JaneStreet’s Bin_prot library, which makes extensive use of these primitives to implement very efficient serialization code.
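
As a rough illustration of the kind of primitive meant here, the sketch below binds the compiler's 64-bit bigstring accessors directly. The names `bigstring`, `get_int64` and `set_int64` are chosen for this example only; offsets are in bytes, accesses are bounds-checked, and values use the machine's native byte order.

```ocaml
open Bigarray

type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t

(* Compiler primitives: ocamlopt can compile these to a bounds check
   plus a plain load or store instead of a C call. *)
external get_int64 : bigstring -> int -> int64 = "%caml_bigstring_get64"
external set_int64 : bigstring -> int -> int64 -> unit = "%caml_bigstring_set64"

(* Store two 64-bit fields back to back and read them again. *)
let write_pair (buf : bigstring) (off : int) (a : int64) (b : int64) : unit =
  set_int64 buf off a;
  set_int64 buf (off + 8) b

let read_pair (buf : bigstring) (off : int) : int64 * int64 =
  (get_int64 buf off, get_int64 buf (off + 8))
```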

samwgoldman · December 10, 2019, 8:35pm

If you are looking to store OCaml values in shared memory directly, then you will run into other issues. A big one is the GC. The runtime will not know what to do if it reaches values outside of the managed heap. You can work around this by ensuring the GC bits of these values are “black,” but you still need to be very careful to avoid pointers from shared memory into the managed heap, as GC can move those values out from under you.

Furthermore, polymorphic operations like compare, hashing, and marshaling will not know what to do with your values and will treat them as opaque pointers. That is, the string value “foo” in shared memory would not be =-comparable with “foo” in the managed heap. String.compare should work.

What’s more, since OCaml’s value representation uses absolute pointers, you need to ensure that those pointers are valid in all processes with a mapping to shared memory. You could do this by making the mappings all at a fixed address, which is kind of yucky, or somehow encoding/decoding the pointers with respect to a base pointer, which feels pretty complicated.
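
To make the base-relative idea concrete, here is a small sketch that stores a linked list in an int bigarray, using cell indices in place of absolute pointers so the structure stays valid wherever the region is mapped. The layout and the names `heap`, `set_cell` and `to_list` are assumptions made for this example, not taken from any existing library.

```ocaml
open Bigarray

(* Cell i occupies slots 2*i (value) and 2*i+1 (index of next cell).
   A next index of -1 marks the end of the list. *)
type heap = (int, int_elt, c_layout) Array1.t

let set_cell (h : heap) (i : int) ~(value : int) ~(next : int) : unit =
  h.{2 * i} <- value;
  h.{2 * i + 1} <- next

(* Follow the indices, never raw addresses, to rebuild an OCaml list. *)
let rec to_list (h : heap) (i : int) : int list =
  if i < 0 then [] else h.{2 * i} :: to_list h h.{2 * i + 1}
```

Because every link is an offset from the start of the array, each process can map the region at a different address and still traverse the same list.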

ivg · December 10, 2019, 9:39pm

Certainly they are. Consider looking at Gerd Stolpmann’s multicore library. It is available as part of ocamlnet, so you can install it with

 opam install ocamlnet

See also netshm.

UnixJunkie · December 11, 2019, 1:42am

I know ocamlnet. I used it in parany (Netmcore_queue):

I also know quite well parmap, where I added several features and use
it extensively in production.

In fact, I am thinking about writing my second parallel library for OCaml.
And I don’t want to pull in the big ocamlnet dependency this time.
Also, I want finer grain control over locking via semaphores.
And I don’t want the shm to be governed by the GC.

UnixJunkie · December 11, 2019, 1:45am

I think I know this library, and found the interface pretty cumbersome
and the code overly complex.
I think it is available in opam in the package hack_parallel.

UnixJunkie · December 11, 2019, 1:48am

What you are describing looks like ocamlnet.

http://projects.camlcity.org/projects/dl/ocamlnet-4.0.4/doc/html-main/Intro.html#netshm
http://projects.camlcity.org/projects/dl/ocamlnet-4.0.4/doc/html-main/Intro.html#netcamlbox
http://projects.camlcity.org/projects/dl/ocamlnet-4.0.4/doc/html-main/Intro.html#netmulticore

UnixJunkie · December 11, 2019, 2:05am

I don’t want the shm to be managed by the GC.
So, the values that will be OK to put in the shm will only be all the basic
types which are supported by the Bigarray module.
I know parmap does marshal/unmarshal to/from a char bigarray, but that’s not what I want to do.
I want to avoid Marshal.
I might end up with a library in which users have to provide their own read/write functions to/from the Bigarray they have allocated.
But, I expect this to be faster than the Marshal module, because this will mostly
be data copy. It should also be more compact.
So, the library I am considering will maybe not be completely generic.
But, for some use cases it will do the job and should be pretty fast.
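
A sketch of what such user-provided read/write functions could look like, assuming a hypothetical `point` record stored as three consecutive floats in a float64 bigarray. None of these names come from an existing library.

```ocaml
open Bigarray

type point = { x : float; y : float; z : float }
type buf = (float, float64_elt, c_layout) Array1.t

(* Each point takes three consecutive float64 slots. *)
let floats_per_point = 3

(* Copy the fields of the i-th point into the buffer; no Marshal involved. *)
let write_point (b : buf) (i : int) (p : point) : unit =
  let o = i * floats_per_point in
  b.{o} <- p.x;
  b.{o + 1} <- p.y;
  b.{o + 2} <- p.z

(* Rebuild the i-th point from the buffer. *)
let read_point (b : buf) (i : int) : point =
  let o = i * floats_per_point in
  { x = b.{o}; y = b.{o + 1}; z = b.{o + 2} }
```

The encoding is a fixed 24 bytes per point with no header, which is where the expected gain in speed and compactness over Marshal would come from.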

bluddy · December 11, 2019, 2:33am

ocamlnet doesn’t exactly fit what I’m talking about. In many ways, ocamlnet is more powerful than what I described, in that it can store OCaml values in shared memory. However, it has to do horrible things to make it work (like overriding C functions in the runtime system) and is quite unsafe.

In my mind, the only safe way to transfer ocaml values between processes is via serialization over pipes, and if you want to use shared memory, you should only use C-types.

UnixJunkie · December 11, 2019, 3:44am

Then, in Parmap there is a hack that consists in exchanging OCaml values via marshalling to/from a char bigarray.

https://github.com/rdicosmo/parmap/blob/bfe1de1350acd306ffce4a874a69e5463c87304f/bytearray.ml

(***************************************************************************)
(* bytearray.ml : functions for efficient marshaling to and from bigarrays *)
(*                                                                         *)
(* Copyright 1999-2011, Jérôme Vouillon                                    *)
(*                                                                         *)
(*  This library is free software: you can redistribute it and/or modify   *)
(*  it under the terms of the GNU Lesser General Public License as         *)
(*  published by the Free Software Foundation, either version 2 of the     *)
(*  License, or (at your option) any later version.  A special linking     *)
(*  exception to the GNU Lesser General Public License applies to this     *)
(*  library, see the LICENSE file for more information.                    *)
(***************************************************************************)

open Bigarray

type t = (char, int8_unsigned_elt, c_layout) Array1.t

type tf = (float, float64_elt, c_layout) Array1.t

let length = Bigarray.Array1.dim

(This file has been truncated.)

Maybe you can start from that for your use case.

By the way, I also put it in opam as the bytearray package.
Because I thought it could be useful out of parmap.
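
For readers unfamiliar with the idea, here is a naive pure-OCaml sketch of marshalling through a char bigarray. It is not the bytearray.ml code, which is more efficient; the names `to_bigarray` and `of_bigarray` are made up for this example, and it copies byte by byte for clarity.

```ocaml
open Bigarray

type bytearray = (char, int8_unsigned_elt, c_layout) Array1.t

(* Marshal a value to a string, then copy the bytes into a fresh bigarray. *)
let to_bigarray (v : 'a) : bytearray =
  let s = Marshal.to_string v [] in
  let ba = Array1.create char c_layout (String.length s) in
  String.iteri (fun i c -> ba.{i} <- c) s;
  ba

(* Copy the bytes back into a string and unmarshal. As with any use of
   Marshal, the caller must annotate the expected type; nothing checks it. *)
let of_bigarray (ba : bytearray) : 'a =
  let s = String.init (Array1.dim ba) (fun i -> ba.{i}) in
  Marshal.from_string s 0
```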

ivg · December 11, 2019, 1:00pm

Just to clarify, shm in ocamlnet (see the netshm module) is not governed by GC, it is a bigarray underneath the hood.

UnixJunkie · December 17, 2019, 1:29am

From my understanding, it has its own GC.

theblatte · December 18, 2019, 10:35am

The concurrent hashmap that it implements has actually been made into its own library: https://github.com/rvantonder/hack-parallel/

mbacarella · December 18, 2019, 2:52pm

“I am thinking about the write-one-read-many mode of communication, where a writer process would create a bigarray as a file on disk. Once this file is created, several reader processes would access it for reading (only for reading) by mapping it into memory (Unix.map_file).”

To perhaps state the obvious, you don’t need to back it with a disk-based file. You should be able to map /dev/zero, and then you get a shared region without the overhead of ever synchronizing to disk.

It sounds weird, but that’s what glibc does to serve larger sized malloc() calls.

samwgoldman · December 18, 2019, 5:03pm

It seems like this library has a somewhat old version of the shared mem code. The most notable improvement since this snapshot is the addition of an in-place compacting GC for the heap.

Before the new GC, we did something pretty silly: allocating a new temporary heap with malloc sized to fit all live objects, copy everything into the temporary heap, then back into shared mem.

So the old GC can cause very spiky allocation for large heaps and can run afoul of the OOM killer.

[Ed: @theblatte I just clicked through to your profile and now understand that the above is information you already had :P]

bluddy · December 18, 2019, 5:17pm

Is it possible to update this library with the latest algorithm?

samwgoldman · December 18, 2019, 7:03pm

It should be fairly straightforward to take the code from Flow and update the hack-parallel repo. We don’t maintain that library ourselves, but if I find a spare moment I might do it myself. Maybe during the holidays.
