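It’s worth digging in here to find out precisely what the problem was, since it’s difficult to beat a kernel-spliced socket in userspace. It could be that you were trying to go from socket to socket. sendfile(2) has a few restrictions: in particular, its input descriptor must support mmap(2)-like operations, so it cannot be a socket. It also has a reputation for being unreliable under some circumstances. You could also try splice(2) and its associated functions directly, bearing in mind that one of the two descriptors passed to splice(2) must be a pipe, so a socket-to-socket copy has to go through an intermediate pipe: splice(2) - Linux manual page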
If all that fails, there are examples of ring buffers using the Xen device protocol at GitHub - mirage/shared-memory-ring: Xen-style shared memory rings, and a disk-persistent version (probably too conservative for your use case) at GitHub - mirage/shared-block-ring: A simple on-disk fixed length queue. Both include examples of Lwt patterns for interleaving.