Debugging memory issues


#1

Hey all,

I’m working on a new compiler/transpiler written in OCaml and we’ve (unfortunately?) been using the Format library to emit the code that we’re generating. Today I’m just starting to run it on some huge test programs and finding that it’s taking up dozens of GB of memory pretty quickly and totally swamping the machine only a little ways into the program. I tried debugging with ocamldebug but I can’t find any specific code that’s repeating too much or anything like that - it just seems like something invisible is happening and my number one suspect is that it’s the building up of these long Format strings (with nested “%a” compositional formatting). But I’d like to figure out for sure so I don’t waste time switching away from the Format library or something if it’s an unrelated issue.

Any help appreciated! Thanks :slight_smile:


#2

You could give spacetime (https://blog.janestreet.com/a-brief-trip-through-spacetime/) a try. It’s a new OCaml memory profiler. You’ll need to install a spacetime-specific compiler switch, though.


#3

In my experience, Format doesn’t leak memory (if you use fprintf into some IO channel; if you print into a buffer then the buffer might grow, of course). It’s pretty fast and very convenient. I second the idea of using some form of profiling such as spacetime to find where the leak comes from.


#4

I have not used it myself, but Obj.reachable_words reports the words reachable from a given value. If you suspect some value already to be responsible for the leak, you could observe it with this function.
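
A minimal sketch of how that could look, assuming you already have a suspect value in hand. The helper names `words_of` and `buffer_growth` are made up for illustration, and a `Buffer.t` stands in for whatever value you suspect:

```ocaml
(* Words reachable from [v], headers included, as reported by
   Obj.reachable_words. [words_of] is a hypothetical helper name. *)
let words_of (v : 'a) : int = Obj.reachable_words (Obj.repr v)

(* Measure a buffer before and after appending [n] bytes to it. *)
let buffer_growth (n : int) : int * int =
  let buf = Buffer.create 16 in
  let before = words_of buf in
  Buffer.add_string buf (String.make n 'x');
  let after = words_of buf in
  (before, after)

let () =
  let before, after = buffer_growth 1_000_000 in
  Printf.printf "before: %d words, after: %d words\n" before after
```

Logging this number at a few points in the program should show whether the suspect value is the one that keeps growing.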


#5

I have debugged quite a few things with https://github.com/jhjourdan/ocaml/tree/memprof . It is available in the opam repository as ocaml-variants.4.07.1+statistical-memprof. You should use it with https://github.com/jhjourdan/statmemprof-emacs


#6

Thank you all! I got spacetime to work (ish - I’m on a Mac) and was able to see that the memory leak is coming from the Format library, though I’m not sure exactly how I can avoid it:

Time 1.314798 (1/1), Total 1429001264 live bytes [0x1085430f1;0x108542b47]
 65.75%   939524144b  buffer.ml:70,19--40
 34.25%   489477120b  bytes.ml:68,12--22

Notably, when compiled with spacetime, it doesn’t allocate 20+GB of memory, it fails after a GB or two with an exception (to follow). I wonder if I’m just using the library wrong? Maybe there are rules about closing boxes in the same fprintf call or how to make multiple fprintf calls, one after another?

Uncaught exception:
  
  (Invalid_argument Bytes.create)

Raised by primitive operation at file "buffer.ml", line 70, characters 19-40
Called from file "buffer.ml", line 87, characters 34-46
Called from file "format.ml", line 343, characters 4-28
Called from file "format.ml", line 473, characters 6-72
Called from file "format.ml", line 480, characters 6-24
Called from file "format.ml", line 1197, characters 32-48
Called from file "format.ml", line 1187, characters 4-20
Called from file "format.ml", line 1199, characters 32-48
Called from file "format.ml", line 1248, characters 20-38
Called from file "lib/Stan_math_backend.ml", line 92, characters 2-129

Here is the code on line 92 of my file (apologies in advance):

let emit_for_loop ppf (loopvar, lower, upper, emit_body, body) =
  fprintf ppf "@[<hov>for (@[<hov>size_t %s = %a;@ %s < %a;@ %s++@])" loopvar
    emit_expr lower loopvar emit_expr upper loopvar ;
  fprintf ppf "@;<0 4>@[<v>%a@]@]" emit_body body

#7

Are you trying to allocate a huge (> 1GB) text inside one buffer? Also check whether you actually flush the formatter sometimes (with @. for a newline, @? for a mere flush). It’s still weird that the buffer grows that much.
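
To illustrate the two flush directives, here is a small sketch that prints into a local buffer. The function name `flush_demo` is made up for the example. Note that, per the Format documentation, a flush also closes any boxes that are still open, so it belongs at the end of a document rather than in the middle of a pretty-printing routine:

```ocaml
(* Print into a private buffer and return its contents after each flush.
   [flush_demo] is a hypothetical name used only for this sketch. *)
let flush_demo () : string * string =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  (* "@." emits a newline and flushes the formatter. *)
  Format.fprintf ppf "@[<v>a@,b@]@.";
  let after_newline_flush = Buffer.contents buf in
  (* "@?" flushes without adding a newline. *)
  Format.fprintf ppf "c@?";
  let after_plain_flush = Buffer.contents buf in
  (after_newline_flush, after_plain_flush)

let () =
  let s1, s2 = flush_demo () in
  Printf.printf "%S\n%S\n" s1 s2
```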


#8

Should be a couple hundred lines of ASCII, maybe a few KB max. How often is sometimes? We aren’t flushing at all now except to collect the resulting string from Format.str_formatter. Doesn’t flushing close any open boxes?


#9
  • I’d recommend not using str_formatter; global mutable state is generally a code smell. Try allocating a buffer and using Format.formatter_of_buffer, or even Format.asprintf directly.
  • Do you use Buffer.contents or Format.flush_str_formatter? The former would be incorrect since it wouldn’t clear the buffer.
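
A small sketch of the difference between the two, assuming the code runs in the main domain so that Format.str_formatter writes into Format.stdbuf. The two function names are made up for the example:

```ocaml
(* Problematic pattern: read the shared buffer but never clear it, so
   every call also returns everything printed by earlier calls. *)
let render_without_reset (s : string) : string =
  Format.fprintf Format.str_formatter "%s" s;
  Format.pp_print_flush Format.str_formatter ();
  Buffer.contents Format.stdbuf

(* Format.flush_str_formatter flushes, returns the contents, and resets
   the shared buffer, so each call starts from an empty buffer. *)
let render_with_reset (s : string) : string =
  Format.fprintf Format.str_formatter "%s" s;
  Format.flush_str_formatter ()

let () =
  let a = render_with_reset "first" in
  let b = render_with_reset "second" in
  Printf.printf "with reset:    %S then %S\n" a b;
  let c = render_without_reset "first" in
  let d = render_without_reset "second" in
  Printf.printf "without reset: %S then %S\n" c d;
  (* Leave the shared buffer empty again. *)
  ignore (Format.flush_str_formatter ())
```

With the first pattern the shared buffer only ever grows, and each returned string contains all the output produced so far.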

#10

Super weird. I think it was that we were using Format.str_formatter (just as an initial test that everything hooked up together correctly, I swear!). Using Format.asprintf "%a" <etc> doesn’t show the same ridiculous memory consumption and crashing. Thank you so much!
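
For reference, a minimal sketch of the two alternatives to the global formatter mentioned above. The printer `emit_decl` and both wrapper names are hypothetical stand-ins, not code from the actual project:

```ocaml
(* A hypothetical printer standing in for the real emit_* functions. *)
let emit_decl ppf (name, value) =
  Format.fprintf ppf "@[<hov 2>int %s =@ %d;@]" name value

(* Option 1: Format.asprintf builds a fresh string on each call. *)
let decl_to_string_asprintf decl = Format.asprintf "%a" emit_decl decl

(* Option 2: a local buffer and formatter, flushed once at the end. *)
let decl_to_string_buffer decl =
  let buf = Buffer.create 256 in
  let ppf = Format.formatter_of_buffer buf in
  emit_decl ppf decl;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () =
  print_endline (decl_to_string_asprintf ("x", 42));
  print_endline (decl_to_string_buffer ("y", 7))
```

Neither version touches shared state, so nothing accumulates between calls.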


#11

:ok_hand: perfect. Now for optimal ergonomics I’d suggest trying the fmt library, or the one in containers.


#12

Thanks for the recommendation. What makes you like those APIs better? For example, we’re currently using the Format library’s sort of punctuation-based box API ("@[<v 2>%a@]" etc). Is that frowned upon, or does fmt improve upon it? Is there additional functionality you would highlight for us to look into? We’re still getting a feel for OCaml and its ecosystem, so any advice appreciated!


#13

I personally use a lot of the string-based box API, because it’s compact and quite readable once you’re used to it. But fmt and CCFormat (in containers) provide APIs with more sensible names than pp_print_string, support for colors, and a lot of combinators for lists, options, arrays, etc. You can do fine without them but they’re simply quite convenient.
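
To give a feel for what a list combinator looks like, here is a sketch that uses only the standard library's Format.pp_print_list together with the string-based box API. It does not use fmt or CCFormat themselves, and the names `pp_comma`, `pp_call` and `render_call` are made up for the example:

```ocaml
(* Separator: a comma followed by a break hint. *)
let pp_comma ppf () = Format.fprintf ppf ",@ "

(* Print a call such as f(x, y, z), using the stdlib list combinator
   for the arguments. *)
let pp_call ppf (fn, args) =
  Format.fprintf ppf "@[<hov 2>%s(%a)@]" fn
    (Format.pp_print_list ~pp_sep:pp_comma Format.pp_print_string)
    args

let render_call fn args = Format.asprintf "%a" pp_call (fn, args)

let () = print_endline (render_call "f" [ "x"; "y"; "z" ])
```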