Today I conducted a small experiment: using a debugger on a small OCaml program (built using dune). The program was not written by me, does non-trivial things, and is written in such a way that my usual approaches to understanding what is going on would require more work than I want to put into it.
I took notes on this experience, in the hope that it could be of interest to others – maybe I’m doing things wrong and people will let me know, maybe this can help identify potential tooling improvements.
Disclaimer: I am a complete beginner as far as running OCaml debuggers goes. (I have used ocamldebug and gdb irregularly in the past, never heavily, and have long forgotten how to use them.)
TL;DR:
Bytecode debugging with OCaml 5.3 and dune:
- works fine in Emacs/Tuareg, as it did in the past
- works okay in vscode using ocamlearlybird
- could be improved with a bit more targeted work, some of it probably easy (and some of it hard)
If I understand correctly, no one is specifically working on this right now.
Let me take this occasion to thank the people who contributed to all these tools (Tuareg, ocamldebug, ocamlearlybird, vscode+ocaml integration, dune, etc.).
Why a debugger?
I am looking at an OCaml program that I did not write and that does interesting and complex things. I would like to build my understanding of how it works by observing the flow of values in some parts of the program, on concrete examples of interest.
I am unfamiliar with debuggers and tried other things first:
- I considered modifying the code to print the values it encounters at runtime. But the program does not define pretty-printers for its values, and writing them is cumbersome. (I could probably use deriving to produce printers more easily.)
- My next move is usually dune utop: instead of running the program, I can call its library functions via the toplevel on small examples. But this particular program is only a binary; it was not split into a library and a binary, and splitting it would be non-trivial.
When “printf debugging” and “play in the toplevel” are not immediately within reach, it may be time to try a debugger. A debugger should let us stop at a given point in the program, print values, and move around in the execution trace to better understand what is going on.
Running a debugger in general
To run a debugger on OCaml programs, one has to choose between a bytecode debugger, ocamldebug, and native debuggers such as gdb and lldb.
Native debuggers are not OCaml-specific and likely to be better documented, have more integrated tooling etc., but they are more low-level and don’t know as much about OCaml programs; in particular they’re not so good at printing values.
On the other hand, ocamldebug can print OCaml values, and it is a time-travel debugger that supports going backward in time; but it relies on running the bytecode executable, which is probably 10x slower than the native executable. It is also probably worse when debugging cross-language programs, for example those using the FFI.
I would not try ocamldebug to debug performance-sensitive programs, programs in production, and in particular to debug anything resembling a segmentation fault. But it should offer a nice experience for pure-OCaml programs during their development.
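To make the time-travel workflow concrete, here is a minimal sketch. The program below is a hypothetical stand-in (the names check and sum are mine, not taken from the program discussed in this post); the comments list ocamldebug commands from the OCaml manual that one could use on it, under the assumption that it has been compiled to bytecode with debug information.

```ocaml
(* Hypothetical toy program: sums its integer command-line arguments,
   but raises Failure on a negative one. *)
let check x = if x < 0 then failwith "negative input" else x

let sum xs = List.fold_left (fun acc x -> acc + check x) 0 xs

let () =
  (* Ignore the program name and any argument that is not an integer. *)
  let args =
    Sys.argv |> Array.to_list |> List.tl |> List.filter_map int_of_string_opt
  in
  Printf.printf "%d\n" (sum args)

(* A possible ocamldebug session on the bytecode executable, roughly:

     set arguments 1 2 -3   pass command-line arguments to the program
     run                    go forward until a breakpoint or the program end
     backstep               go back one event, towards the point of failure
     backtrace              show the call stack at the current event
     print x                print a variable visible at the current event
*)
```

The point of the sketch is the order of operations: run forward until the failure, then step backward to inspect the state that led to it.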
The Coq/Rocq maintainers have long been using ocamldebug to understand their software, a large OCaml program with tricky bugs and non-trivial performance requirements. They rely on specific tooling to make it nicer – autoloaded scripts, customized pretty-printers. So there is evidence that ocamldebug can work well when integrated inside a project development workflow. (The program I want to debug has not had any such tooling written for it, so the experience will be more barebones.)
Getting a bytecode executable from Dune
Before going any further, you need to ask dune to generate bytecode executables, by adding (modes byte exe) to the executable stanza. Then you run dune build, and when invoking the debugger you will need to manually pass the path to the bytecode program, for example _build/default/bin/main.bc.
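As a sketch, assuming the executable is called main and its dune file lives in bin/ (both names are assumptions chosen to match the example path above), the stanza could look like this:

```dune
; bin/dune (hypothetical location)
(executable
 (name main)
 ; build both the bytecode (.bc) and the native (.exe) executables
 (modes byte exe))
```

With this in place, dune build ./bin/main.bc builds only the bytecode target, and ocamldebug _build/default/bin/main.bc starts the debugger on it.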
IDE integration
Running ocamldebug directly is doable but not great. Just like it’s nice when IDEs let you jump to the location of a compilation error, you really want the debugger to show you “where” it is in the program execution by showing you a program point in your programming editor. (ocamldebug will print the source line where it is at, so it’s not too bad, but still noticeably less pleasant, and typing movement commands one by one gets old fast.)
I considered two approaches to running a bytecode debugger for OCaml programs:
- run ocamldebug from Emacs/Tuareg
- run ocamlearlybird from vscode
ocamlearlybird in vscode
I first decided to use ocamlearlybird from vscode.
I opened vscode (which is not my usual editor) and tried to use Run > start debugging directly… and it didn’t work well. You need to configure things manually, and the vscode interface did not tell me that: it showed nothing and did not work as expected, without offering much help.
The better way to configure vscode+earlybird is to… read the documentation first. I recommend:
- Read the vscode-ocaml-platform README.md about how to set things up.
- Then read the ocamlearlybird README (which also links to the README above), and in particular watch the short demo, to know what to expect when the interface works. The README documents the fields of the launch.json file that you have to write to describe how to invoke the debugger, and this is helpful.
After reading this, I knew how to tweak the launch.json file so that the debugger would pass command-line arguments to the program, and it started working correctly.
Unfortunately ocamlearlybird does not currently support time-travel (issue), so it is only possible to stop at breakpoints and move forward in time, while I was expecting to run until a failure and then go backward in time, as I usually do with ocamldebug. At this point I decided to go back to my familiar Emacs.
Points to improve
When trying to “run the debugger” without having configured a specific bytecode program, the vscode UI appears to work but does nothing. For example it is possible to add breakpoints, etc., and then clicking “run” does nothing that I can see.
I wish there was clearer feedback when things are not set up and there is no chance that it will work. This would also be a good time to point me to the online documentation – from within the IDE – so that the process is more discoverable.
ocamldebug in Emacs
At this point I had already set things up to build a bytecode executable in Dune, so things were easy: M-x ocamldebug and there you go. There is documentation in the user manual, which was probably written more than a decade ago, and it mostly reads just fine today.
(Note: some of the documented keybindings do not work: C-c C-k is documented as stepping back in the manual, but it is not supported by Tuareg (issue).)
Moving around program execution is fun, printing values works okay – the next step for convenience would be to install custom printers to get nice output.
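For reference, here is a minimal sketch of what such a custom printer could look like. The expr type below is a hypothetical stand-in for the program's AST, and pp_expr and show_expr are names I made up; the only firm requirement, per the ocamldebug manual, is that a printer for a type t has type Format.formatter -> t -> unit.

```ocaml
(* Hypothetical AST, standing in for the real program's types. *)
type expr =
  | Var of string
  | Int of int
  | Let of expr * expr * expr  (* Let (x, e, body) means: let x = e in body *)

(* A printer of the shape ocamldebug expects:
   Format.formatter -> expr -> unit *)
let rec pp_expr ppf = function
  | Var x -> Format.fprintf ppf "%s" x
  | Int n -> Format.fprintf ppf "%d" n
  | Let (x, e, body) ->
    Format.fprintf ppf "@[<hv>let %a = %a in@ %a@]"
      pp_expr x pp_expr e pp_expr body

(* Convenience wrapper to try the printer outside the debugger. *)
let show_expr e = Format.asprintf "%a" pp_expr e
```

In a real setup the printer has to be defined over the program's own type rather than a copy of it, compiled to a bytecode object file, and then registered in the debugger with the load_printer and install_printer commands. The exact file path and module path to pass depend on the build layout, so I leave them out here.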
Points to improve
- Emacs jumps to the source code to follow the program execution in the debugger; but on every movement in the execution trace it asks me again whether I want src/foo.ml instead of _build/default/src/foo.ml, and this is annoying. (Sometimes I did not observe this behavior, not sure why.)
- Dune applies various wrapping/mangling of module names that shows up in the ocamldebug printing, and can be annoying at times. For example some module names show up as Dune__exe.Foo, and I would prefer to see just Foo. I think it should be possible to hard-code some more de-Dune-mangling logic in the debugger’s pretty-printer, and ideally we could even make it user-configurable or dune-configurable a bit.
- If I print an AST from an execution point that does not have open Ast in its typing environment, the AST is printed like Dune__exe.Ast.Let (Dune__exe.Ast.Var "x", ...). It would be nice to omit the Dune__exe part, but ideally I should also be able to tell the debugger: “let’s open Ast locally from now on when you print values”, so that it prints in a more readable way by default.