As a near-term solution that offers time travel but not fully interactive debugging, I think this would be super useful. This could be solved entirely on the dune side, by providing some way to autoload printers that follow a naming convention like the one in deriving_show. To me the most annoying thing in debugging is all the work required to get ocamldebug to print something useful.
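To make the naming convention concrete, here is a minimal sketch of what such printers look like when written by hand. The `vec2` and `entity` types are made up for illustration; only the `pp_<type>` / `show_<type>` naming mirrors what ppx_deriving's show plugin generates. A function of type `Format.formatter -> t -> unit` is also the shape that ocamldebug's `install_printer` command expects, which is why autoloading by name seems feasible.

```ocaml
(* Hypothetical types; only the pp_/show_ naming follows the convention. *)
type vec2 = { x : float; y : float }

let pp_vec2 fmt { x; y } = Format.fprintf fmt "{ x = %g; y = %g }" x y
let show_vec2 v = Format.asprintf "%a" pp_vec2 v

type entity = { name : string; pos : vec2 }

(* Printers compose: the entity printer reuses the vec2 printer via %a. *)
let pp_entity fmt e =
  Format.fprintf fmt "{ name = %S; pos = %a }" e.name pp_vec2 e.pos

let show_entity e = Format.asprintf "%a" pp_entity e

let () =
  print_endline (show_entity { name = "player"; pos = { x = 1.5; y = -2. } })
```

A tool that knows the convention could look for `pp_<type>` next to every type definition and register it, instead of the user doing it one by one.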
I should mention that my use-case (a game engine) uses tsdl and Owl, and one of them (can't remember which off-hand) simply doesn't want to work in bytecode mode, so it's not an option.
This looks very promising!
I am going to try it out this week
Also… `down` is really nice (at least once I realised I need `#use "topfind"` in my `~/.ocamlinit`) …turns the base `ocaml` shell into something that feels much more modern and friendly. Was a new discovery to me via one of the links above.
I also noticed that ocamldebug just bails out if there's Lwt involved, and ppx_debug from above says something about no concurrency support…
But if it's only concurrency and not parallelism then interactive debugging should still be possible?
In Python I can't `await` an `async` function from the pdb prompt, but I can set breakpoints, navigate the stack, step forward etc. with no problem.
Might be useful to think about the problem a bit more generally than "debugging". Observability might be a better term, where debugging is just one way of observing the behaviour of a program: there are other tools available too.
Debugging concurrent (or distributed) programs can be challenging with "printf" style debugging, but it is not impossible: make sure you log the thread id, and have a unique identifier for related pieces of computation that you prefix or include in your log messages in some form. Logs can then be post-processed to extract a per-request view of what happened (e.g. in the case of a web-server); sometimes this "post-processing" is done with `grep` (or `ripgrep` if you've got gigabytes to wade through).
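A minimal sketch of that logging discipline, with everything in it (names, message format, the in-memory buffer) made up for illustration. A real program would write to a file or syslog and take the thread id from the threads library; the filter at the end stands in for `grep`.

```ocaml
(* Hypothetical logger: every line carries a thread id and a request id so
   that interleaved output can be split per request afterwards.
   Lines are kept in memory here only to keep the sketch self-contained. *)
let log_lines : string list ref = ref []

let log ~thread_id ~request_id msg =
  let line = Printf.sprintf "[thread=%d] [req=%s] %s" thread_id request_id msg in
  log_lines := line :: !log_lines

(* Naive substring search, standing in for grep. *)
let contains ~sub s =
  let n = String.length sub and m = String.length s in
  let rec go i = i + n <= m && (String.sub s i n = sub || go (i + 1)) in
  go 0

(* Per-request view: all lines tagged with the given request id, in order. *)
let lines_for_request request_id =
  List.rev !log_lines
  |> List.filter (contains ~sub:(Printf.sprintf "[req=%s]" request_id))

let () =
  log ~thread_id:1 ~request_id:"r-17" "start VM.start";
  log ~thread_id:2 ~request_id:"r-18" "start VM.stop";
  log ~thread_id:1 ~request_id:"r-17" "done VM.start";
  List.iter print_endline (lines_for_request "r-17")
```

The important part is that the request id is chosen once at the entrypoint and passed down, so a single search recovers the whole story of one request.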
This works (it is the main form of debugging the XAPI project…), but can be quite tedious, and can be problematic if there is more than a single request to follow (e.g. if you call into other threads or external components, or same program running on another host, etc.).
We've been experimenting lately with OpenTelemetry (there is an ocaml library that supports the format), and I think that might be the answer to being able to trace large, distributed (or concurrent) systems. It is an open format with various visualization tools to choose from once a trace has been recorded, and this tracing can be turned on/off at runtime, you can choose what to sample (everything, everything that originates from a certain API call, etc.), you can later on batch or post-process/filter the captured trace, etc.
It (currently) requires you to manually instrument your (API) entrypoints, and then you get a hierarchic/nested view of how an API call was handled, including all the nested API calls that one made, and the time it took for each of them (allowing you to more easily spot bottlenecks), and what logs each call made. All that is required to make that work is a library supporting the format, and modification to the code to plumb through a "context id", which may seem quite an invasive change to the code, but may be worthwhile in the flexibility and observability you gain. Since the format is open this also works cross-language (if you have multiple components written in different languages all interacting with each-other).
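To show what plumbing a "context id" through looks like, here is a hand-rolled sketch. It is not the API of the OCaml OpenTelemetry library: `context`, `span` and `with_span` are hypothetical names, and the timing uses `Sys.time` (CPU time) only because it is in the standard library.

```ocaml
(* Hypothetical, hand-rolled spans; NOT the OpenTelemetry library's API. *)
type context = { trace_id : string; parent : string option }

type span = {
  trace : string;
  name : string;
  parent_name : string option;
  seconds : float;
}

let recorded : span list ref = ref []

(* Run [f] as a named span. [f] receives a child context to pass further
   down, which is how the nesting ends up in the recorded data. *)
let with_span ctx name f =
  let t0 = Sys.time () in
  let child = { ctx with parent = Some name } in
  Fun.protect
    ~finally:(fun () ->
      recorded :=
        { trace = ctx.trace_id; name; parent_name = ctx.parent;
          seconds = Sys.time () -. t0 }
        :: !recorded)
    (fun () -> f child)

(* An instrumented "API entrypoint" with two nested calls. *)
let handle_request ctx =
  with_span ctx "api.create" (fun ctx ->
      with_span ctx "db.insert" (fun _ -> ());
      with_span ctx "notify" (fun _ -> ()))

let () =
  handle_request { trace_id = "t-1"; parent = None };
  List.iter
    (fun s ->
      Printf.printf "%s %s (parent: %s) %.6fs\n" s.trace s.name
        (Option.value s.parent_name ~default:"-") s.seconds)
    (List.rev !recorded)
```

The invasive part is visible in `handle_request`: every function on the path has to accept and forward the context. In exchange, each recorded span knows its trace and its parent, which is what a viewer needs to draw the nested timeline.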
When this works, it promises to be better than a visual debugger in some respects.
Although it doesn't allow you to break into and inspect and modify state that you didn't add code to inspect ahead of time, quite often I find that such breakpoints would be difficult to use in a distributed system (the other part of the system would want to carry on, timeout, etc. while you're blocked in the debugger), and some bugs/issues/race conditions can only be observed by recording the events as they happen and later inspecting them. Obviously there is the danger that recording the events prevents it from being reproducible (a heisenbug), but most bugs aren't like that.
Beyond "printf" debugging and tracing (whether with OpenTelemetry or otherwise) there are 2 other "traditional" debugging techniques that worked quite well with OCaml for me:
- timer based statistical stack sampling ("flamegraphs"). If things are stuck, or just a lot slower than expected it is very useful to visualize where, and as long as you tell it to use DWARF info to unwind the stack samples it works reasonably OK with OCaml, and can at least pinpoint functions that may be unexpected bottlenecks. Although it might be useful to build something into your program that allows you to query it to tell you what "high level" operation each thread is doing, to allow you to identify things that are "stuck" more easily.
  Recently fixed a bug turning a function that used ~18% CPU into 0.02% CPU that way, buried deep in an OCaml-C binding library where no one would've expected to look for such bugs (the bug was in the OCaml code), and further investigation revealed that there was a correctness bug hiding behind the performance bug too (this time in the C code it was interfacing with…)
- the new memory profiling introduced in latest versions of OCaml can sometimes point out surprising properties for your code, e.g. places where allocations happen way too often, which may not be a bug, but is a candidate for optimization/simplification and can benefit both performance and correctness (if you were not expecting that piece of code to allocate that much, what else have you missed about the code's behaviour?)
And finally, while this won't help you debug things "in production", writing a small unit test to exercise the property you're looking at, and then loading the code up with `dune utop`, can help in understanding the code's behaviour (and when you're done you also have a regression test you can use so you don't reintroduce the bug you're fixing…).
If you want to take things a step further you can write a quickcheck-style property test (there are various implementations for this in OCaml: crowbar, monolith, qcheck-stm, etc.), which will find a bug with a (minimized) input, and buggy output all ready for bugfixing. This is more bug-hunting (e.g. in a particularly unstable/untrustworthy looking piece of code) than debugging.
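For a feel of what such a test does, here is a hand-rolled sketch using only the standard library; it does not use the API of any of the libraries named above, which do this (and much more) for you. `dedup` is a hypothetical function with a deliberate bug, and the generator and shrinker are deliberately naive.

```ocaml
(* Hypothetical buggy function: meant to remove all duplicates, but it only
   removes *adjacent* ones. *)
let rec dedup = function
  | a :: (b :: _ as rest) -> if a = b then dedup rest else a :: dedup rest
  | l -> l

(* Property: the output never contains the same element twice. *)
let prop l =
  let out = dedup l in
  List.length (List.sort_uniq compare out) = List.length out

(* Greedy shrinking: keep dropping single elements while the property
   still fails, so the reported input is as small as this strategy can get. *)
let rec shrink l =
  let candidates = List.mapi (fun i _ -> List.filteri (fun j _ -> j <> i) l) l in
  match List.find_opt (fun c -> not (prop c)) candidates with
  | Some smaller -> shrink smaller
  | None -> l

(* Generate small random int lists and return the first (shrunk) failure. *)
let find_counterexample ~seed ~tests =
  let st = Random.State.make [| seed |] in
  let gen () =
    List.init (Random.State.int st 8) (fun _ -> Random.State.int st 4)
  in
  let rec loop n =
    if n = 0 then None
    else
      let l = gen () in
      if prop l then loop (n - 1) else Some (shrink l)
  in
  loop tests

let () =
  match find_counterexample ~seed:42 ~tests:1000 with
  | None -> print_endline "no counterexample found"
  | Some l ->
    Printf.printf "counterexample: [%s]\n"
      (String.concat "; " (List.map string_of_int l))
```

The shrunk input is what makes this useful for bugfixing: instead of a long random list you get a three-element one where the first and last elements are equal, which points straight at the non-adjacent case.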
Also when designing your program it is useful to keep in mind debuggability/observability: if this goes wrong, do I have all the information needed to debug this? Should I log something, and if I do is the log message unique enough? (using one of `__FUNCTION__`, etc. can help here)
If I want to turn logging on/off at runtime can I do that easily and reliably?
If I want to load this up and debug it in utop can I do that, or did I put too much logic into a single huge function where I cannot inspect intermediate state from `utop`?
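A small sketch that touches all three questions, with every name in it hypothetical. `__LOC__` is built into OCaml and expands to the source position (recent compilers, 4.12 onwards as far as I know, also have `__FUNCTION__`); the mutable flag is the simplest possible runtime switch; and the logic is split into small functions that can each be called from utop.

```ocaml
(* Hypothetical mini-logger. Note that the message is still formatted when
   logging is off; a real logging library avoids that cost. *)
let debug_enabled = ref false

let debug loc fmt =
  Printf.ksprintf
    (fun msg -> if !debug_enabled then prerr_endline (loc ^ ": " ^ msg))
    fmt

(* Small pure steps are easy to call one at a time from utop. *)
let parse s = int_of_string_opt (String.trim s)
let double n = 2 * n

let process s =
  (* __LOC__ makes the message traceable to exactly one source position. *)
  debug __LOC__ "parsing %S" s;
  Option.map double (parse s)

let () =
  (* In a long-running program this could be flipped by a signal handler
     or an admin API call instead. *)
  debug_enabled := true;
  match process " 21 " with
  | Some n -> Printf.printf "%d\n" n
  | None -> print_endline "not a number"
```

With this layout, `parse " 21 "` and `double 21` can be tried separately at the toplevel, which is exactly what a single huge function would prevent.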
I don't think tracing is always a valid replacement for debuggers, but
it's certainly useful, especially for performance issues.
Aside from opentelemetry (which is good for networking and distributed
systems), for local programs that are CPU heavy or interactive, like
games, I recommend Tracy (GitHub - wolfpld/tracy: C++ frame profiler) for which we
at Imandra have a small OCaml library
(GitHub - imandra-ai/ocaml-tracy: Bindings to the Tracy profiler). It's super lightweight at
runtime and is widely used for video games performance analysis. It's
based on manual instrumentation (no context to carry around). It can
also sample stack traces (although iirc in OCaml they're not great).
Interesting. Is the recommended way to use it a Git submodule of that repository within a project? Is there anything special needed to set it up in terms of dune configuration?
Interesting discussion, for sure. All these things are nice to have and use but as c-cube said, they do not replace the debugger. Most of the time you need to step through your program, inspect the variables as they change, see the execution flow etc. It is just very handy (and time saving) to do this from inside your editor/IDE.
When I found earlybird for VS Code, I assumed that I would have a dream dev environment around OCaml. But alas…
I don't know what is causing this problem for you but I am able to debug in `ocamldebug` even if `Lwt` is involved.
The issue may be multiple system threads. If I remember correctly, the debugger does not like that, and Lwt may have created one.
Also you may want to try OCaml 4.14 / 5.0.0~beta2 and see if the problem still occurs for you. There were some issues with older versions of ocamldebug: it simply refused to work if there were system threads. In OCaml 4.14, ocamldebug will work until a new system thread is spawned. Anyway, this is just from memory and I am probably wrong about some fine details.
TL;DR: use OCaml 4.14+, you should be able to debug a single (system) threaded program that uses `Lwt`.
Yes, it was that.
I think I was on 4.13 at the time so if there were fixes for this area in 4.14 I'll try it again, thanks!
If I remember correctly, you just have to `opam pin` it with a git path, and it should work. There is no particular dune magic to use it.
It uses a dune virtual library for `tracy` (which is the lightweight instrumentation side, pretty low cost), which is what you can use in most of your code. Then, in your main binary, you add `tracy-client` as a dep and that's where the bindings come into play, if you call `Tracy.enable()`. There's an example that does exactly that.
Does it require an OCaml compiler with frame pointers to output useful instrumentation?
It's based on manual instrumentation, so not necessarily. You can add scopes where you want.
If the tracy process has enough permissions (e.g. run as root) it can also capture a lot more data by itself, including stack samples, in which case I think it benefits from debug symbols. It's a tool primarily designed for C++ and the like. Quoting the manual:
On gcc or clang remember to specify the debugging information -g parameter during compilation and
do not add the strip symbols -s parameter. Additionally, omitting frame pointers will severely reduce the quality of stack traces, which can be fixed by adding the -fno-omit-frame-pointer parameter. Link the executable with an additional option -rdynamic (or --export-dynamic, if you are passing parameters directly to the linker).
It's able to read OCaml debug info though; see here in the middle, it points to the line instrumented in the example program.
Here is what I find useful.
$ cd _build/default
# run ocamldebug -- for your specific example this would be:
$ ocamldebug tests/test_oktree.bc
$ info modules
The advantage of running from within `_build/default` is that `ocamldebug` seems to find paths more easily.
Also, in the `dune-project` file I use the `(wrapped_executables false)` stanza. It saves me from doing weird things like `break @Dune__exe__SomeModule 42`; I can simply do `break @SomeModule 42`.
Thanks, I'll try these tips!
I think they highlight that it'd be nice to have dune handle all these awkward details and "optional but kind of necessary" config hacks though
<slight_offtop>
Some time ago, I watched the beginning of the "Handmade Hero" series, where the speaker claimed that (in 2015) GNU/Linux had much less convenient debuggers than Windows. It would be great to research that and understand what could be improved in general, and maybe it would be great if OCaml had the most powerful debugger compared to all languages/infrastructures available on GNU/Linux.
</slight_offtop>
@Kakadu I fully agree. Not having a debugger is a show stopper, at least for people who have been used to one for decades.
The OCSF funded some of the work on ocamlearlybird IIRC. So there are some resources put on the problem. But if there is no one willing to work on such a project for the long term then things won't get better.
@Khady this OCSF funding on ocamlearlybird is a recent thing?
One question, some people mentioned that it's possible to use gdb with OCaml, although not as convenient because you need to understand how OCaml objects are represented at runtime, and all type information is stripped away.
Having said that, there must be some graphical interfaces for gdb, even on Linux, right? Could those be used with OCaml?