Is it possible to mmap a regular OCaml array to a file?

I know it is supported by Bigarrays, but unfortunately Bigarrays only support numerical values.
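For reference, here is a minimal sketch of the Bigarray route, assuming OCaml 4.06 or later with the unix library linked. The temporary file name and the `read_back` helper are made up for illustration.

```ocaml
(* Sketch: fill a float64 Bigarray backed by a file, then map the file
   again for reading. Needs the unix library. *)
let path = Filename.temp_file "mmap_demo" ".bin"  (* hypothetical file *)
let n = 4

(* Writer: shared = true, so stores go through to the file. The file is
   grown automatically to fit the requested dimensions. *)
let () =
  let fd = Unix.openfile path [ Unix.O_RDWR; Unix.O_CREAT ] 0o600 in
  let a =
    Bigarray.array1_of_genarray
      (Unix.map_file fd Bigarray.float64 Bigarray.c_layout true [| n |])
  in
  for i = 0 to n - 1 do
    a.{i} <- float_of_int (i * i)
  done;
  Unix.close fd

(* Reader: shared = false gives a private mapping, so nothing the reader
   does can modify the file. *)
let read_back () : float array =
  let fd = Unix.openfile path [ Unix.O_RDONLY ] 0 in
  let a =
    Bigarray.array1_of_genarray
      (Unix.map_file fd Bigarray.float64 Bigarray.c_layout false [| n |])
  in
  Unix.close fd;
  Array.init n (fun i -> a.{i})
```

The element kind has to be one of the numerical kinds Bigarray provides, which is exactly the limitation I am running into.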

More info: the array would be completely filled prior to being mmapped to file.
Consumers would then only read this array and the array would
never be modified once it has been created.

Use Marshal?

Marshal would force users to unmarshal the array prior to being able to read it, so that’s not what I want.

mmap relies on a constant address. Regular arrays live in the OCaml heap, where the garbage collector can move them, so it does not seem possible to use a regular array directly for mapped files.

Would it be somewhat possible to “GC-freeze” (ask the GC to not move it until further notice) an array?
Get the memory address and length where it is, then tell another OCaml process
to mmap that address as a regular array of the correct type (that would be accessed read-only).

I should probably use the ancient library, but it would need to be revived for recent OCaml versions (it is no longer maintained).

No, I don’t think it’s possible to do that.

I’m a bit unclear what your use case for this is. You want a regular OCaml array containing non-numerical (so, presumably, some OCaml data structure?) elements to be represented directly in a file. If you’re not using the Marshal module, how do you intend for the contents of the file to be interpreted? The array elements would be pointers, which obviously aren’t meaningful between different processes.
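To make the pointer issue concrete, here is a small sketch using the `Obj` module (unsafe introspection, for illustration only). The `is_pointer` helper is a made-up name.

```ocaml
(* Immediate values such as ints are stored inline in an array slot.
   Boxed values such as strings are stored as a pointer to a separate
   heap block, and that pointer only makes sense inside this process. *)
let is_pointer (x : 'a) : bool = Obj.is_block (Obj.repr x)

let ints = [| 1; 2; 3 |]
let strings = [| "a"; "b" |]

let ints_are_inline = not (is_pointer ints.(0))
let strings_are_pointers = is_pointer strings.(0)
```

Dumping the raw memory of `strings` to a file would save only the addresses of the strings, not their contents.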

You are right, which means the answer to my question is “no, this is impossible or useless”.

You can probably build something to do what you want using the same methods that allowed construction of the Bigarray extension, but it won’t be quite as clean as just mmapping native arrays.

For my current use-case, I could also write serialisers to/from an int bigarray.
I’ll try this when I have some time.
Though this is way less generic than what I initially intended to do…
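A rough sketch of what such a serialiser could look like, assuming a hypothetical fixed-size element type `point` with three int fields (the type and the `encode`, `get` and `length` names are invented for the example):

```ocaml
(* Hypothetical element type: every element occupies exactly [fields]
   consecutive slots of an int Bigarray. *)
type point = { id : int; x : int; y : int }

let fields = 3

(* Flatten a regular array of records into an int Bigarray. *)
let encode (pts : point array) =
  let ba =
    Bigarray.Array1.create Bigarray.int Bigarray.c_layout
      (fields * Array.length pts)
  in
  Array.iteri
    (fun i p ->
      ba.{fields * i} <- p.id;
      ba.{(fields * i) + 1} <- p.x;
      ba.{(fields * i) + 2} <- p.y)
    pts;
  ba

(* Rebuild the i-th element on demand, without decoding the rest. *)
let get ba i =
  { id = ba.{fields * i}; x = ba.{(fields * i) + 1}; y = ba.{(fields * i) + 2} }

let length ba = Bigarray.Array1.dim ba / fields
```

Readers would call `get` on individual indices, so only the elements actually accessed are decoded. The same layout should work on a Bigarray obtained from a mapped file instead of `Bigarray.Array1.create`.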

This is the point of the ancient library, but it comes with a lot of restrictions. If your data is reasonably fixed-size (no arbitrary-length lists, for instance), that’s really the simplest solution. If you can bound the size of your value, a quick and dirty version is effectively marshaling to a char bigarray.
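A minimal sketch of the quick and dirty version, using only the standard library. The `buf` type and the `to_bigarray` and `of_bigarray` names are invented here. For simplicity it allocates a Bigarray of exactly the marshaled size rather than writing into a preallocated, bounded one.

```ocaml
type buf =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Marshal a value, then copy the resulting bytes into a char Bigarray. *)
let to_bigarray (v : 'a) : buf =
  let bytes = Marshal.to_bytes v [] in
  let len = Bytes.length bytes in
  let ba = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
  for i = 0 to len - 1 do
    ba.{i} <- Bytes.get bytes i
  done;
  ba

(* Copy the bytes back out and unmarshal. As with any use of Marshal,
   the result type is not checked, so the caller must annotate it. *)
let of_bigarray (ba : buf) : 'a =
  let len = Bigarray.Array1.dim ba in
  let bytes = Bytes.init len (fun i -> ba.{i}) in
  Marshal.from_bytes bytes 0
```

Note that the reader still pays for a copy and an unmarshal, so this does not give zero-copy access to the data.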

But ancient is no longer maintained.
And to maintain it, I suspect you need to know the GC internals.