Buffered IO, bytes vs bigstring

It mostly depends on your allocation policy and the lifetime of your buffers. The question is difficult and the answer depends on your context, but I can list some particularities of both:

  • A bigstring/bigarray cannot be relocated by the GC. That mostly means that the buffer will never move, even if the GC enters a cycle.
  • Because the buffer is never relocated, we can release the GC lock while working on it. This is what digestif, an implementation of several hash algorithms, does. We know that these algorithms mostly “calculate”; they don’t allocate, for instance. So we can say that the upcoming computation can proceed regardless of the GC, and in the context of lwt/async (or even multicore), that allows a kind of true parallelism.
  • Bigarray.Array1.sub allocates a “proxy” of the initial bigarray. A sub does not copy the bigarray; it gives you a smaller representation which permits access to a slice of the bigarray. An example is mirage-tcpip, which inspects the TCP/IP packet through a succession of sub calls - which permits zero-copy between the given packet and the application layer.
    • For this specific aspect, the reality is a bit more complex. Indeed, even if we only want to allocate a smaller representation of the given bigarray (a slice), this representation will be allocated in the major heap (but I think it’s not true anymore due to this commit). This is why cstruct appeared as a solution: it keeps the ability to take slices of a bigarray while allocating them in the minor heap (which is faster than the major heap). From that, a nice API now exists to manipulate bigarrays and take advantage of this.
  • Specialization on int32 Bigarray and int64 Bigarray is done by the compiler. That mostly means that if you manipulate such values, the compiler is able to avoid an extra allocation on the projection/injection of these values from/to the bigarray. Some calculations can become pretty fast compared with an int8 Bigarray and {get,set}_int{32,64} functions that manipulate these values in a given serialized form (endianness).
  • Small bytes (at most Max_young_wosize = 256 words) are allocated on the minor heap, which consists of “just” preparing a new block and shifting the pointer of the stop-and-copy minor heap (which is pretty fast).
  • You can take advantage of Bytes.unsafe_{of,to}_string to move between bytes and string for free (string being the type that rules out an illegal set via the type system) because, at runtime, string and bytes have the same representation.
  • If you want to mmap, you must use a bigarray.
  • If you want to manipulate a buffer shared between multiple processes, you must use a bigarray - again, because the GC will never move the buffer. This is what I am trying to do on my side with rowex, a small persistent index.
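
To make the slice/copy difference from the list above concrete, here is a minimal sketch using only the standard library (it assumes OCaml >= 4.08 for Bytes.get_int32_be). The function names are hypothetical and exist only for illustration; this is not how mirage-tcpip or cstruct are implemented.

```ocaml
(* Hypothetical alias for a byte-oriented bigarray, often called "bigstring". *)
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* [Bigarray.Array1.sub] builds a proxy on the same memory: a write made
   through the slice is visible in the parent buffer. *)
let sub_shares_memory () =
  let packet : bigstring =
    Bigarray.Array1.create Bigarray.char Bigarray.c_layout 16 in
  Bigarray.Array1.fill packet '\000';
  let payload = Bigarray.Array1.sub packet 4 8 in
  Bigarray.Array1.set payload 0 'x';
  Bigarray.Array1.get packet 4 = 'x'

(* [Bytes.sub] copies: a write to the copy leaves the original untouched. *)
let bytes_sub_copies () =
  let buf = Bytes.make 8 'a' in
  let copy = Bytes.sub buf 0 4 in
  Bytes.set copy 0 'b';
  Bytes.get buf 0 = 'a'

(* Reading a serialized 32-bit big-endian value out of a small [bytes]. *)
let read_be32 (buf : bytes) (off : int) : int32 =
  Bytes.get_int32_be buf off

(* [Bytes.unsafe_to_string] does not copy. It is only sound if [buf] is
   never mutated after the conversion. *)
let freeze (buf : bytes) : string = Bytes.unsafe_to_string buf
```

A zero-copy parser would chain several sub calls on the same packet, while the bytes version pays one copy per slice unless you pass offsets and lengths around by hand.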

I think some other particularities exist but again, it really depends on what you want to do. For instance:

  • decompress (an implementation of zlib) uses bigarray because it’s fair to assume that the input buffer and the output buffer will have a looong life :slight_smile:.
  • Conversely, digestif uses both types: it can be interesting to take advantage of the GC lock (and the ability to release it), and it is still interesting to digest a simple string or small objects (in general).
  • I just started a draft to use bytes instead of cstruct/bigarray in mirage-crypto after I checked its memory consumption, which can put huge pressure on the GC due to the allocation via malloc of small objects (2 or 4 bytes).
  • Obviously, a library such as parmap must use bigarray as a shared buffer between processes to get true parallelism.

Some questions arise from all of that:

  • can we functorize the code over a common interface between bytes and bigarray?
  • can we use GADTs to specialize some branches according to these values?
  • should we just be arbitrary in our choice?
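
As an illustration of the first question, here is a minimal sketch of a functor over a common interface. BUFFER, Bytes_buffer, Bigstring_buffer and Make_checksum are hypothetical names invented for this example, and the checksum is a toy, not an algorithm taken from any of the libraries mentioned.

```ocaml
(* Hypothetical minimal interface shared by both representations. *)
module type BUFFER = sig
  type t
  val length : t -> int
  val get : t -> int -> char
end

module Bytes_buffer : BUFFER with type t = bytes = struct
  type t = bytes
  let length = Bytes.length
  let get = Bytes.get
end

module Bigstring_buffer : BUFFER
  with type t =
    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t =
struct
  type t =
    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t
  let length = Bigarray.Array1.dim
  let get = Bigarray.Array1.get
end

(* The algorithm is written once against the interface. *)
module Make_checksum (B : BUFFER) = struct
  (* Toy checksum: sum of all bytes, kept on 16 bits. *)
  let sum (buf : B.t) : int =
    let acc = ref 0 in
    for i = 0 to B.length buf - 1 do
      acc := (!acc + Char.code (B.get buf i)) land 0xffff
    done;
    !acc
end

module Bytes_checksum = Make_checksum (Bytes_buffer)
module Bigstring_checksum = Make_checksum (Bigstring_buffer)
```

Every access goes through B.get here, so whether the compiler turns it back into a direct primitive access depends on how much it inlines across the functor application.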

I would say that, from my experiments, OCaml is not really able to specialize an implementation that is generic over a 'buffer, whether via functors or GADTs. I know that you should have a better chance with flambda, which is more aggressive than vanilla OCaml. But from my experience, it’s not an obstacle to adoption if you arbitrarily choose bytes or bigarray, as long as the choice is consistent with your usage - and this is where it becomes complex to fully describe what you need :slight_smile: .
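
For the GADT variant, a minimal sketch could look like the following. The kind type, its constructors and count_zeros are hypothetical names made up for illustration; nothing here is claimed to be the API of digestif or any other library.

```ocaml
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* A witness telling which concrete buffer type ['buffer] is. *)
type 'buffer kind =
  | As_bytes : bytes kind
  | As_bigstring : bigstring kind

let length : type b. b kind -> b -> int = fun kind buf ->
  match kind with
  | As_bytes -> Bytes.length buf
  | As_bigstring -> Bigarray.Array1.dim buf

let get : type b. b kind -> b -> int -> char = fun kind buf i ->
  match kind with
  | As_bytes -> Bytes.get buf i
  | As_bigstring -> Bigarray.Array1.get buf i

(* Generic code: the witness is matched again on every access unless the
   compiler manages to specialize the loop for a known [kind]. *)
let count_zeros : type b. b kind -> b -> int = fun kind buf ->
  let n = ref 0 in
  for i = 0 to length kind buf - 1 do
    if get kind buf i = '\000' then incr n
  done;
  !n
```

The match on the witness inside the loop is exactly the kind of branch one would hope to see specialized away.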

But in any case, it’s hard to have the best of both worlds in the same type. Many of these particularities are mutually exclusive due to the underlying design of the caml runtime. So I continue to say that it depends :stuck_out_tongue: .
