Do we have a pure OCaml, high-performance, persistent key-value store in 2023?

Sorry to answer with more questions:

  • Which Operating Systems (OS) are needed to run the code?
  • Which build system is needed?
  • Which API can be used? [C, C++]
  • What type of software license is needed?

I ask because I (EricGT) have a very similar need (related to SWI-Prolog work). RocksDB checks most of the boxes, but getting it to compile natively on Windows is still a work in progress (vcpkg might be the answer, and MSYS2 is also of value). I hope that macOS is close enough to Linux not to be a problem, but I have never programmed for macOS, so that is only a guess.

As for choosing a build system that has to cover multiple operating systems and/or external libraries, the trend seems to be CMake. With any other build system you have to do most of the legwork yourself and may find yourself in a group of one for certain subtasks.

If you want to learn more about RocksDB from a programmer's perspective, see the RocksDB blog entries. There are lots of unofficial blogs and other info on RocksDB, but many of those can be misleading, gloss over a needed detail, be out of date, be specific to a particular problem, etc.

While RocksDB does have a C API, it does not expose the full complement of functionality; for that, the C++ API must be used.


I also agree with others that you should not build such a K-V store from scratch on your own. I would not; it would be worse than shooting yourself in the foot. Use time-tested, production-quality code that is used by many others.


EDIT

For real world numbers using a large real world data set (SemMedDB) with SWI-Prolog and RocksDB as the Key-Value store (ref)

Cumulative writes: 
   339M writes, 
   339M keys, 
   339M commit groups,
   1.0 writes per commit group,
ingest:
   34.14 GB,
   0.70 MB/s

If you are interpreting that 34.14 GB of data was loaded then you are reading it correctly.
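
A rough sanity check on those numbers can be sketched in a few lines of Python. This is only back-of-the-envelope arithmetic on the figures quoted above, and it rests on two assumptions that the output itself does not state: that "M" means 10**6 and that GB and MB are 1024-based units. The variable names are made up for illustration.

```python
# Back-of-the-envelope check of the RocksDB statistics quoted above.
# Assumptions (not stated in the output): "M" = 10**6, GB/MB are 1024-based.

WRITES = 339e6          # "339M writes" (already rounded in the stats)
INGEST_GB = 34.14       # "ingest: 34.14 GB"
RATE_MB_PER_S = 0.70    # "0.70 MB/s"

# Average payload per write: total ingested bytes divided by number of writes.
ingest_bytes = INGEST_GB * 1024**3
bytes_per_write = ingest_bytes / WRITES

# Wall-clock time implied by the average ingest rate.
ingest_mb = INGEST_GB * 1024
seconds = ingest_mb / RATE_MB_PER_S
hours = seconds / 3600

print(f"average size per write: {bytes_per_write:.0f} bytes")
print(f"time implied by the average rate: {hours:.1f} hours")
```

Under those assumptions this works out to roughly 108 bytes per write and about 14 hours of loading. With 1.0 writes per commit group, each key was written on its own rather than in batches, which is worth keeping in mind when reading the MB/s figure.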
