Proposal
OCaml on 32-bit platforms has not been well maintained by the community for a long time, mostly because its memory model differs from the 64-bit one.
Couldn't we run the 64-bit memory model on 32-bit platforms? By that I mean that on 32-bit platforms the word size would become 64 bits. I understand that this implies some performance loss and roughly double the memory usage, especially with multicore. But it would mean zero extra support required from the community.
Why not an RFC?
I just want to discuss the idea, and writing a proper RFC takes a LOT of time.
Performance
I believe there are some ways to mitigate the performance problems, and even to improve performance on 64-bit platforms.
Multicore Locking
As 32-bit platforms may not have atomic writes to 64-bit values, locking may be required for writes to mutable fields under multicore. This could probably be mitigated by a compiler flag, --no-multicore, which would enable a GIL.
Int31.t
Because 32-bit platforms generally have fewer registers (looking at you, i386), we could provide an Int31 module. On 32-bit platforms it would give the compiler the information needed to keep basic integer operations fast and to use a single register when more than 32 bits are not required.
The same applies to pointers on 32-bit platforms: by extracting that information from the typer, we could use a single register.
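To make the idea concrete, here is a rough sketch of what the surface of such a module could look like. It is only an illustration: the module name Int31 and all of its functions are hypothetical, and it only emulates 31-bit wrap-around on top of the native int, it does not do anything about registers.

```ocaml
(* Hypothetical Int31 module: values always fit in 31 bits, whatever the
   native word size is. This only emulates the semantics; the register
   allocation part of the proposal would live in the compiler. *)
module Int31 : sig
  type t = private int
  val of_int : int -> t
  val to_int : t -> int
  val add : t -> t -> t
  val mul : t -> t -> t
  val max_int : t
  val min_int : t
end = struct
  type t = int

  (* Number of extra bits in the native int (32 on 64-bit, 0 on 32-bit). *)
  let shift = Sys.int_size - 31

  (* Keep the low 31 bits and sign-extend them. *)
  let of_int x = (x lsl shift) asr shift
  let to_int x = x
  let add a b = of_int (a + b)
  let mul a b = of_int (a * b)
  let max_int = (1 lsl 30) - 1
  let min_int = -(1 lsl 30)
end

let () =
  (* Overflow wraps around exactly like a native 31-bit int would. *)
  let wrapped = Int31.(add max_int (of_int 1)) in
  Printf.printf "max_int + 1 = %d\n" (Int31.to_int wrapped)
```

On a stock 32-bit OCaml, where Sys.int_size is already 31, the normalization in of_int is a no-op, which is the case the proposal wants to keep cheap.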
Int31array.t
Similar to Floatarray, this would be a new kind of runtime tag where every field uses only 32 bits; it can then be used to represent Int31array.t.
On 64-bit platforms a block with this tag will still always end up with a true size that is a multiple of 64 bits, as that is required to keep memory alignment. So size 1 = 64 bits, size 2 = 64 bits, size 3 = 128 bits. This is not required on 32-bit platforms, but may be desirable for FFI reasons and to avoid a couple of #ifdef in the codebase.
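The size rule is just rounding up to the word size. A small sketch of the arithmetic (packed_block_bits is a made-up helper name, and it only counts the payload, not the block header):

```ocaml
(* Hypothetical helper: payload size in bits of a block holding [n]
   32-bit fields, padded up to a multiple of [word_bits]. *)
let packed_block_bits ~word_bits n =
  let field_bits = 32 in
  let raw = n * field_bits in
  (* Round up to the next multiple of the word size. *)
  (raw + word_bits - 1) / word_bits * word_bits

let () =
  List.iter
    (fun n ->
      Printf.printf "size %d = %d bits\n" n
        (packed_block_bits ~word_bits:64 n))
    [ 1; 2; 3 ]
```

With ~word_bits:32 the same function gives 32, 64 and 96 bits, which is the unpadded layout that would be possible on 32-bit platforms.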
On 32-bit platforms we can have another optimization like the float array one, where an int31 array is automatically packed. I would say that if the float array optimization is enabled, then this can also be enabled on 64-bit platforms without performance loss.
This tag can also be used for records: { x: int31, y: int31 } will use just 64 bits. This can also be done on 64-bit platforms, where this record will use only a single word.
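As a rough illustration of the single-word layout, two 31-bit integers fit side by side in one 64-bit value. The sketch below emulates that with Int64; pack, unpack_x and unpack_y are hypothetical names, and in the actual proposal the runtime would do this through the block tag rather than through library code.

```ocaml
(* Emulate { x: int31; y: int31 } stored in a single 64-bit word:
   x lives in the high 32 bits, y in the low 32 bits. *)
let pack (x : int) (y : int) : int64 =
  Int64.logor
    (Int64.shift_left (Int64.of_int x) 32)
    (Int64.logand (Int64.of_int y) 0xFFFF_FFFFL)

(* Arithmetic shift right recovers x with its sign. *)
let unpack_x (w : int64) : int = Int64.to_int (Int64.shift_right w 32)

(* Shift left then arithmetic shift right sign-extends the low half. *)
let unpack_y (w : int64) : int =
  Int64.to_int (Int64.shift_right (Int64.shift_left w 32) 32)

let () =
  let w = pack (-1) 5 in
  Printf.printf "x = %d, y = %d\n" (unpack_x w) (unpack_y w)
```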
On 32-bit platforms any record containing pointers can be treated as 32-bit, so that ptr array or { a: ptr, b: ptr } will be packed. This may not be desirable, as it will break the FFI and the goal here is full compatibility. But it can probably be provided as a compiler flag to recover even more of the lost performance.
float32 and Float32array.t
If we provided a float32 type, it could be used for more memory-efficient data structures on both 64-bit and 32-bit platforms.
It behaves identically to float (but with 32 bits), and when it can be packed it will be packed as a Float32array.t instead of a Floatarray.t. The same goes for records: { x: float32, y: float32 } uses only 64 bits of memory.
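Something close to this can already be approximated today with what the standard library offers, which gives a feel for the memory savings. The sketch below is not the proposed float32 type: it uses Bigarray's float32 kind for packed storage and Int32.bits_of_float / Int32.float_of_bits to round a double to single precision. The helper name round_to_float32 is made up.

```ocaml
(* Round a double to the nearest single-precision value, i.e. what a
   32-bit float field would actually store. *)
let round_to_float32 (x : float) : float =
  Int32.float_of_bits (Int32.bits_of_float x)

let () =
  (* A packed array of 4 single-precision floats: 4 bytes per element. *)
  let a = Bigarray.Array1.create Bigarray.float32 Bigarray.c_layout 4 in
  Bigarray.Array1.fill a 0.1;
  Printf.printf "%d bytes for 4 elements\n" (Bigarray.Array1.size_in_bytes a);
  (* Reading back gives the rounded value, not the original double. *)
  Printf.printf "stored 0.1 reads back as %.17g\n" (Bigarray.Array1.get a 0)
```

The same four values in a Floatarray.t would take 32 bytes of payload, so the packed form halves the memory, at the price of single-precision rounding.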
Another possible optimization would be to add a tag for a 32-bit box, so that it uses half the memory on 32-bit platforms. On 64-bit platforms it needs to use 64 bits to be compatible with the memory model, but it opens room for further optimizations. It may not be desirable for FFI reasons, but may be interesting to have under a compiler flag.
Implementation
Yes, I understand that it is a lot of work. But most of it seems like it can be done without disrupting OCaml core development, and in small incremental steps.