I’m currently trying to debug a performance problem. I have several nodes talking to each other via Async RPCs, and a single node that acts as a gateway to clients.
When the gateway doesn’t communicate with any other nodes, performance is in the ~100k ops/s range. However, as soon as the gateway makes an RPC call to another node before replying to the client, performance drops to ~400 ops/s.
Checking strace for long syscalls, I see futex calls that apparently take ~1s to complete. Is this normal? I’d presume this is caused by contention, probably over the Async lock, but I can’t see why there would be that much contention.