Two thoughts (from back in the day; it’s been a long time since I had to find one of these):
(1) have you tried valgrind?
(2) set things up so that when/if the problem occurs, you’ll get a full (not truncated) core dump. You can test whether it’s set up right by sending the process a SEGV signal with "kill -SEGV <pid>". Once you can load that core into gdb and walk around, wait for a real failure.
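A rough sketch of the "make sure cores aren't truncated" part, assuming a Linux/Unix box and that you launch the program from a Python wrapper (both are my assumptions, not something you said). It only raises the soft core-size limit up to whatever the hard limit allows; `enable_full_core_dumps` and `describe` are names I made up for illustration.

```python
import resource


def enable_full_core_dumps():
    """Raise the soft core-size limit to the hard limit.

    Returns the resulting (soft, hard) pair. The limit applies to this
    process and to any children it starts afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft != hard:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def describe(limit):
    """Render a core-size limit for humans."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


if __name__ == "__main__":
    soft, hard = enable_full_core_dumps()
    print("core soft limit:", describe(soft))
    print("core hard limit:", describe(hard))
```

If the hard limit itself comes back as 0 or something small, this can't help; that has to be fixed wherever the limit is being imposed.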
Also, limit the amount of memory your program can use – maybe that’ll make the crash show up sooner?
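One way to cap memory, sketched under the assumption that you're on Linux/Unix and can start the program from a wrapper: set an address-space limit in the child just before it execs. `run_with_memory_limit` is a hypothetical helper name. Caveat: if this is a JVM, a heap cap such as -Xmx is probably the more natural knob, since a tight address-space limit can stop the JVM from starting at all.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(cmd, max_bytes):
    """Run cmd with its virtual address space capped at max_bytes.

    Returns the child's exit code (negative means killed by a signal).
    """
    def apply_limit():
        # Runs in the child after fork and before exec, so only the
        # child (and its descendants) are affected.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limit).returncode


# Example (hypothetical command): cap at 1 GiB.
# code = run_with_memory_limit(["./my_program", "--some-arg"], 1024 ** 3)
```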
What native libraries are you using? A look at /proc/<pid>/maps should tell you, for starters.
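If eyeballing the maps file gets tedious, something like this pulls out the shared objects. It's a minimal sketch assuming Linux's usual maps layout (address, perms, offset, dev, inode, pathname) and that anything whose file name ends in ".so" or contains ".so." is a native library; `parse_maps` and `mapped_libraries` are names I invented.

```python
import os


def parse_maps(text):
    """Return the sorted, de-duplicated shared-library paths found in maps text."""
    libs = set()
    for line in text.splitlines():
        # Pathname is the optional sixth field; anonymous mappings lack it.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        name = os.path.basename(path)
        if name.endswith(".so") or ".so." in name:
            libs.add(path)
    return sorted(libs)


def mapped_libraries(pid="self"):
    """Read /proc/<pid>/maps (Linux only) and list the mapped libraries."""
    with open(f"/proc/{pid}/maps") as f:
        return parse_maps(f.read())


# Example: print("\n".join(mapped_libraries(12345)))  # 12345 is a placeholder pid
```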
ETA: Also, IIRC there are memory errors that no process can catch, so it might be worthwhile looking into what your OS records about such errors. I know on Linux there is some provision for capturing minimal information, controllable by something-or-other via sysctl. Also, turn on GC logging and see if that shows you anything useful (can’t hurt, might tell you something).
ETA2: yet another: how frequent is the failure? The more you can do to make that failure more frequent, the better. So it’s worth keeping track of how you run the program and how long it takes to fail, so you can try to reduce that time. Low MTBF is your friend when debugging what appears to be a Heisenbug.
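For the bookkeeping, a throwaway harness along these lines might be enough. It's only a sketch: it assumes a failure shows up as a nonzero exit status, and `run_until_failure` plus the command in the comment are made up for illustration.

```python
import subprocess
import sys
import time


def run_until_failure(cmd, max_runs=100):
    """Rerun cmd until it exits nonzero or max_runs is reached.

    Returns a list of (attempt, seconds, returncode) tuples. On POSIX a
    negative returncode means the child died from that signal number,
    e.g. -11 for SIGSEGV.
    """
    records = []
    for attempt in range(1, max_runs + 1):
        start = time.monotonic()
        code = subprocess.run(cmd).returncode
        elapsed = time.monotonic() - start
        records.append((attempt, elapsed, code))
        if code != 0:
            break
    return records


# Example (hypothetical command):
# for attempt, secs, code in run_until_failure(["./my_program"], max_runs=50):
#     print(f"run {attempt}: {secs:.1f}s, exit {code}")
```

Log the settings you used next to each batch of runs, so you can tell which changes actually shorten the time to failure.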