Bytecode debugging in OCaml 5.3

Today I ran a small experiment: using a debugger on a small OCaml program (built with dune). I did not write the program, it does non-trivial things, and it is written in such a way that my usual approaches to understanding what is going on would require more work than I want to pour into it.

I took notes on this experience, in the hope that it could be of interest to others – maybe I’m doing things wrong and people will let me know, maybe this can help identify potential tooling improvements.

Disclaimer: I am a complete beginner as far as running OCaml debuggers goes. (I have used ocamldebug and gdb irregularly in the past, never heavily, and have long forgotten how to use them.)

TL;DR:

Bytecode debugging with OCaml 5.3 and dune:

  • works fine in Emacs/Tuareg, as it did in the past
  • works okay in vscode using ocamlearlybird
  • could be improved with a bit more targeted work,
    some of it probably easy (and some of it hard)

If I understand correctly, no one is specifically working on this right now.
Let me take this occasion to thank the people who contributed to all these tools (Tuareg, ocamldebug, ocamlearlybird, vscode+ocaml integration, dune, etc.).

Why a debugger?

I am looking at an OCaml program that I did not write and that does interesting and complex things. I would like to build my understanding of how it works by observing the flow of values in some parts of the program, on concrete examples of interest.

I am unfamiliar with debuggers and tried other things first:

  1. I considered modifying the code to print the values it encounters at runtime. But the program does not define pretty-printers for its values, and writing them is cumbersome. (I could probably use deriving to produce printers more easily.)

  2. My next move is usually dune utop: instead of running the program, I can call its library functions via the toplevel on small examples. But this particular program is only a binary, it was not split as a library and a binary, and splitting it would be non-trivial.

When “printf debugging” and “play in the toplevel” are not immediately within reach, it may be time to try a debugger. They should let us stop at a given point in the program, print values, and move around in the execution trace to better understand what is going on.

Running a debugger in general

To run a debugger on OCaml programs, one has to choose between a bytecode debugger, ocamldebug, and native debuggers such as gdb and lldb.

Native debuggers are not OCaml-specific and are likely to be better documented, to have more integrated tooling, etc., but they are more low-level and don’t know as much about OCaml programs; in particular they’re not so good at printing values.

On the other hand ocamldebug can print OCaml values, and it is a time-travel debugger that supports going backward in time; but it relies on running the bytecode executable that is probably 10x slower than the native executable. It is also probably worse when debugging cross-language programs, for example using the FFI.

I would not try ocamldebug to debug performance-sensitive programs, programs in production, and in particular to debug anything resembling a segmentation fault. But it should offer a nice experience for pure-OCaml programs during their development.

The Coq/Rocq maintainers have long been using ocamldebug to understand their software, a large OCaml program with tricky bugs and non-trivial performance requirements. They rely on specific tooling to make it nicer – autoloaded scripts, customized pretty-printers. So there is evidence that ocamldebug can work well when integrated inside a project development workflow. (The program I want to debug here has no such tooling written for it, so the experience will be more barebones.)

Getting a bytecode executable from Dune

Before going any further, you need to ask dune to generate bytecode executables, by adding

 (modes byte exe)

to the executable stanza. Then you run dune build, and when invoking the debugger you will need to manually pass the path to the bytecode program, for example _build/default/bin/main.bc.
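
For concreteness, here is a minimal sketch of what the full stanza could look like. The executable name main and the bin/ directory are placeholders chosen to match the example path above, not the layout of the program I was debugging.

```dune
; bin/dune (hypothetical location and executable name)
(executable
 (name main)
 ; build both the bytecode program (main.bc) and the native one (main.exe)
 (modes byte exe))
```

With this in place, dune build ./bin/main.bc should build just the bytecode target. As far as I know dune already passes -g by default, so no extra flag is needed to get debug information.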

IDE integration

Running ocamldebug directly is doable but not great. Just like it’s
nice when IDEs let you jump to the location of a compilation error,
you really want the debugger to show you “where” it is in the program
execution by showing you a program point in your programming
editor. (ocamldebug will print the source line where it is at, so
it’s not too bad, but still noticeably less pleasant, and typing
movement commands one by one gets old fast.)

I considered two approaches to running a bytecode debugger for OCaml programs:

  • run ocamldebug from Emacs/Tuareg
  • run ocamlearlybird from vscode

ocamlearlybird in vscode

I first decided to use ocamlearlybird from vscode.

I opened vscode (which is not my usual editor) and tried to use Run > Start Debugging directly… and it didn’t work well. You need to configure things manually, and the vscode interface did not tell me that: it showed nothing, did not behave as I expected, and offered little help.

The better way to configure vscode+earlybird is to… read the documentation first. I recommend:

  1. Read the vscode-ocaml-platform README.md about how to set things up.
  2. Then read the ocamlearlybird README (which also links to the README above), and in particular watch the short demo, to know what to expect when the interface works. The README documents the fields of the launch.json file that you have to write to describe how to invoke the debugger, and this is helpful.

After reading this, I knew how to tweak the launch.json file so that the debugger would pass command-line arguments to the program, and it started working correctly.
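
To give an idea of the shape of the file, here is a sketch of a .vscode/launch.json. I am writing the field names ("type", "program", "arguments", "stopOnEntry") from memory of the README, so treat them as assumptions and check them against the ocamlearlybird documentation. The program path and the argument are placeholders.

```jsonc
{
  "version": "0.2.0",
  "configurations": [
    {
      // Any label you like; it shows up in the Run and Debug dropdown.
      "name": "Debug main.bc",
      // Assumed debugger type registered by vscode-ocaml-platform.
      "type": "ocaml.earlybird",
      "request": "launch",
      // Hypothetical path to the bytecode program built by dune.
      "program": "${workspaceFolder}/_build/default/bin/main.bc",
      // Hypothetical command-line arguments passed to the program.
      "arguments": ["input.txt"],
      // Stop right away, so that it is visible that the debugger started.
      "stopOnEntry": true
    }
  ]
}
```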

Unfortunately ocamlearlybird does not currently support time-travel (issue), so it is only possible to stop at breakpoints and move forward in time, while I was expecting to run until a failure and then go backward in time, as I usually do with ocamldebug. At this point I decided to go back to my familiar Emacs.

Points to improve

When trying to “run the debugger” without having configured a specific bytecode program, the vscode UI appears to work but does nothing. For example it is possible to add breakpoints, etc., and then clicking “run” does nothing that I can see.

I wish there was clearer feedback when things are not set up and there is no chance that it will work. This would also be a good time to point me to the online documentation – from within the IDE – so that the process is more discoverable.

ocamldebug in Emacs

At this point I had already set things up to build a bytecode executable in Dune, so things were easy: M-x ocamldebug and there you go. There is documentation in the user manual, which was probably written more than a decade ago, and it mostly reads just fine today.

(Note: some of the documented keybindings do not work: C-c C-k is documented as stepping back in the manual, but it is not supported by Tuareg (issue).)

Moving around program execution is fun, printing values works okay – the next step for convenience would be to install custom printers to get nice output.
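
As a sketch of what a custom printer could look like: ocamldebug's install_printer expects a function of type Format.formatter -> t -> unit. The expr type below is a made-up stand-in for the program's AST, and the file and module names in the comments (printers.cma, Printers) are hypothetical.

```ocaml
(* Hypothetical AST, standing in for the real program's Ast module. *)
type expr =
  | Var of string
  | Int of int
  | Let of expr * expr * expr

(* A printer with the type ocamldebug expects for install_printer:
   Format.formatter -> expr -> unit. *)
let rec pp_expr fmt = function
  | Var x -> Format.fprintf fmt "%s" x
  | Int n -> Format.fprintf fmt "%d" n
  | Let (x, e, body) ->
    Format.fprintf fmt "@[<hv>let %a = %a in@ %a@]"
      pp_expr x pp_expr e pp_expr body

(* Convenience wrapper to try the printer outside the debugger. *)
let show_expr e = Format.asprintf "%a" pp_expr e

(* Prints: let x = 1 in x *)
let () = print_endline (show_expr (Let (Var "x", Int 1, Var "x")))

(* Inside ocamldebug, assuming this file is compiled into a bytecode
   archive named printers.cma exposing a module Printers:
     (ocd) load_printer "printers.cma"
     (ocd) install_printer Printers.pp_expr
   after which the print command should use pp_expr for values of this type. *)
```

I have not tried this on the program discussed here; in a real project the printer would have to be written against the program's own types.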

Points to improve

  1. Emacs jumps to source code to follow the program execution in the debugger; but on every movement in the execution trace it asks me again whether I want src/foo.ml instead of _build/default/src/foo.ml, and this is annoying. (Sometimes I did not observe this behavior, not sure why.)

  2. Dune includes various wrapping/mangling of module names that show up in the ocamldebug printing, and can be annoying at times. For example some module names show up as Dune__exe.Foo, and I would prefer to see just Foo. I think it should be possible to hard-code some more de-Dune-mangling logic in the debugger’s pretty-printer, and ideally we could even make it user-configurable or dune-configurable a bit.

  3. If I print an AST from an execution point that does not have open Ast in its typing environment, the AST is printed like Dune__exe.Ast.Let (Dune__exe.Ast.Var "x", ...). It would be nice to omit the Dune__exe part, but ideally I should also be able to tell the debugger: “let’s open Ast locally from now on when you print values”, so that it prints in a more readable way by default.


Nice post, thanks !
A few things I would like to add:

  • Time travelling is possible for the native debuggers with rr. At some point it was Linux-only, it might still be the case, but it’s very nice to use. I have on some occasions debugged bytecode programs by using rr on them, and with the appropriate gdb/lldb macros to print OCaml values it can be useful (but mostly for debugging the C parts; for problems purely on the OCaml side ocamldebug is still better suited). I use it regularly for native debugging and it’s very convenient (it can even help with debugging heisenbugs in parallel programs! Just run rr record ./my_program several times until the bug triggers, and then rr replay will always replay the same run, including thread interleavings, consistently reproducing the bug).
  • I have tried time travelling with ocamldebug in the past and I have hit some serious issues: limited history means that you cannot go very far in the past, and the way it works (by setting checkpoints and replaying from the checkpoint to the required instruction) means that you can often see weird artifacts due to C calls being replayed each time you step back, sometimes breaking the program completely. I’m curious to know if this is just bad luck (or me doing weird things), or if you had similar issues too.
  • The Dune__exe stuff is, I believe, dune’s misguided attempt to shield users from potential conflicts between files linked in the executable and modules with the same name present in non-wrapped libraries required as dependencies. I suspect that (wrapped false) or something like that in the section of the dune file corresponding to the executable will get rid of it.
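
For readers who have not used rr, the record/replay workflow from the first point looks roughly like this. It is a sketch of a session, not a transcript of a real run; ./my_program is the placeholder name used above.

```text
$ rr record ./my_program     (repeat until the bug shows up in a recording)
$ rr replay                  (replays the most recent recording under gdb)
(rr) continue                (run forward to the crash or to a breakpoint)
(rr) reverse-continue        (run backward to the previous breakpoint or watchpoint)
(rr) reverse-next            (step back one source line)
```

The parenthesized notes are annotations, not part of what you type.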

It is possible to use ocamlearlybird with dap-mode in Emacs (link). The setup uses the same json config file as VSCode. I’m putting my effort into DAP support since that gets cross editor support and I can switch between LLDB/ocamlearlybird.

For Emacs the two main options for DAP support are:

  • dap-mode, which ties into lsp-mode and follows that style of things. Uses JSON configuration based off VSCode configuration. The UI elements depend on lsp-mode, so it’s a heavier setup and might not play as well with eglot.
  • dape, standalone DAP mode with a more minimal approach. I didn’t get it working satisfactorily but it seems closer to eglot in philosophy (GitHub: svaante/dape, Debug Adapter Protocol for Emacs).

For both, the challenges I see are:

  1. Setting up DAP itself reliably and with less fuss. It could be smoother and better documented.
  2. Setting up dune builds to generate the right artifacts. Having a direct LSP code action to run a debugger against a particular executable like Rust does would be ideal.
  3. Bugs in ocamlearlybird and lack of maintainer time.

It’s interesting to hear about users of bytecode debugging, I thought there wouldn’t be many people using that.

I personally have never managed to use ocamldebug satisfactorily, but I
wish support for gdb would become a thing one day. I think there’s ways
to provide (python) scripts to teach gdb about the layout of values,
possibly even about calling conventions(?). gdb is overall more
interesting because it works on your actual code (native, not bc) and
handles threads, breakpoints, etc. in most editors/IDEs.

My 2c :-)

This is controlled by (wrapped_executables <bool>) in dune-project; see the wrapped_executables entry in the Dune documentation.
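
A minimal sketch of the corresponding dune-project, assuming the goal is to drop the Dune__exe prefix from executable modules. The (lang dune 3.0) line is only a placeholder for whatever version the project already declares.

```dune
; dune-project
; placeholder version line: keep the one your project already has
(lang dune 3.0)
; do not wrap the modules of executables under Dune__exe
(wrapped_executables false)
```

Whether this fully cleans up the names printed by ocamldebug has not been checked here.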

Cheers,
Nicolas

you can often see weird artifacts due to C calls being replayed each time you step back, sometimes breaking the program completely.

If I understand correctly, each checkpoint is an independent copy of the program state obtained via fork(). This means that C calls should work fine (eg. open file descriptors are duplicated when forking), but threads may not work. The scenario I can see where the program behavior would be affected is if your use of I/O assumes that you are the only program touching certain foreign resources – for example you created a temporary file and you assume that you own it uniquely. With checkpoints (that is, multishot continuations), you can observe several copies of the program acting on the same file concurrently.
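
To make the temporary-file scenario concrete, here is a small sketch using plain Unix.fork. This is not ocamldebug's code, only an illustration of what fork-based copies imply: both copies of the process run the same "rest of the program" and both append to the same file. It needs the unix library, and all names in it are made up for the example.

```ocaml
(* Illustration only: plain fork(), not ocamldebug's checkpoint code.
   Requires the unix library (for example (libraries unix) in dune). *)

(* Append one line to a file, opening and closing it each time. *)
let append path line =
  let oc = open_out_gen [ Open_append; Open_creat ] 0o600 path in
  output_string oc (line ^ "\n");
  close_out oc

(* The part of the program that runs after the "checkpoint". *)
let rest_of_program path = append path "wrote after checkpoint"

(* Returns the lines found in the file once both copies have run. *)
let run_demo () =
  let path = Filename.temp_file "checkpoint_demo" ".log" in
  append path "wrote before checkpoint";
  (* The "checkpoint": fork a copy of the whole process. *)
  (match Unix.fork () with
   | 0 ->
     (* Child copy: runs the rest of the program once, then exits. *)
     rest_of_program path;
     Unix._exit 0
   | child ->
     ignore (Unix.waitpid [] child);
     (* Parent copy: runs the same code again, on the same file. *)
     rest_of_program path);
  let contents = In_channel.with_open_text path In_channel.input_all in
  Sys.remove path;
  String.split_on_char '\n' (String.trim contents)

let () = List.iter print_endline (run_demo ())
```

The file ends up with the "after checkpoint" line twice, even though each copy of the program believes it wrote it once. That is the kind of artifact a program assuming exclusive ownership of the file could trip over.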

Probably worth sharing here:


For those who (like me) have always been perplexed when they try to use ocamldebug, b/c it just doesn’t seem to work like gdb or perl’s debugger, or any other debugger I’ve tried, this video was immensely helpful.

Specifically, I learned that at the beginning of the program, to set a breakpoint in the main program file, I need to do something like (for a file “extract_environments.ml”, and line 79):

break @extract_environments 79

That was blocking me FOREVER, b/c holy cow there’s a ton of initialization, and I just didn’t know how to run thru all that and get to the first meaningful line of my main program.

Now I do. I can move forward from here with ocamldebug. Finally!
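
For anyone else stuck at the same point, a session built around that command might look like the sketch below. The bytecode path and the variable name env are hypothetical; break, run, print and backstep are standard ocamldebug commands.

```text
$ ocamldebug _build/default/bin/main.bc
(ocd) break @extract_environments 79
(ocd) run
(ocd) print env
(ocd) backstep
```

Here run executes forward until the breakpoint is hit, print shows a value that is in scope at that point, and backstep goes back one event in time.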
