Debug logging I can compile-away in release

This is something I’ve been thinking of recently after starting to use Dune’s instrumentation a bit more, and I’m wondering if it exists or if anyone else would be interested.

I’d like a library which allows me to insert logging (by which I mean outputs strings of my choice + some information on the file/line it logged from, not necessarily different levels) which is completely removed at compile time unless I build with --instrument-with this-cool-logging-lib. An even nicer feature would be if it took a page out of landmarks and had an environment variable to select different “categories” of logging (so I could enable it for only some modules, for example, not necessarily severity levels).

By “completely removed” I am asking for there to not even be no-op function calls left over. In my current codebase I do this sort of debug logging using some runtime flags, and believe it or not the calls of debug_log which just return unit are starting to show up appreciably in my profiler.

Does this exist? If it doesn’t, would anyone else be interested in it? I’ve not really dabbled into writing PPXs myself but this seems like a reasonable way to get started.

1 Like

There’s a slogan in some parts of the commercial software world: “we debug what we ship”. It means that the product release already contains all debug hooks, including logging, and they simply need to be turned-on by configuration. Similarly, it’s well-documented that Google’s systems all run with all debug code compiled-in.

You might consider using that PPX to expand to code that checks various debug variables in order to decide whether to even invoke logging functions. As an example of this, Google’s “glog” on various languages supports this. A different way of thinking about it is: you want to minimize the cost of not logging a line. The lower that cost is, the more inactive log-lines you can afford to strew throughout your program, waiting for the need to arise to turn them on.

Also, you might want to look at Google’s Dapper for inspiration for logging. There, logging can be enabled based on a binary condition of a log-line being conditionally-enabled, and then the current “transaction” being enabled for logging. This amounts to a flag in thread-local state being enabled. It allows pervasive and low-cost tracing to be used for performance-analysis, among other things. I’ve done this sort of thing in Java J2EE (spit) applications, and it can be quite effectiive for finding performanace and correctness problems.

While I agree with generally keeping no-ops around that can be enabled through configuration, I am still curious if there exist simple ways to include/exclude branches through compilation flags. When I worked in embedded systems, we couldn’t even fit debug builds on our production hardware and had to use specialized hardware (meaning other small differences) to enable them. So I think there’s still a place for compile-time exclusion of debug code, personally.

This is not supported at the OCaml language level. But there are a couple of preprocessors out there that behave like the C preprocessor #ifdef. So maybe possible. But again, if binary size is that much of an issue you would probably be using C in the first place :slightly_smiling_face:

1 Like

A PPX rewriter could do this trivially. The actual important part is designing the logging library that the PPX-expanded code invokes.

1 Like

The motive isn’t binary size, it’s avoiding (possibly 10s of thousands) of calls to a function that looks like let debug_log x = if !debug then print_endline x else (), where x could be some not-cheap-to-evaluate string.

Using a PPX to move this if to the outside has some benefits, like avoiding the need to evaluate the argument, but still has a massive number of branches which will always either be all true or all false.

I’ve been dabbling in doing this today and it doesn’t seem like it’s too difficult to do in principle, but there are a lot of extra design considerations as @Chet_Murthy alludes to

IIRC you can get to something with two libraries debug and nodebug with a Mode module that has:

val debug : bool

set accordingly and have

if Mode.debug then … else (); 

in your code and link the right library according to your wish. You might need an flambda compiler for that though, I don’t remember.

(Pre-processors are not never so good ideas, in this case they don’t type check the code you exclude)

1 Like

You might be interested in the scheme used by Logs for logging, see the links in this message.

Yeah, this is exactly the issue I ran into trying this myself. I have a mostly-working implementation, but in the mode where the calls are compiled out it never checks that you’re still passing strings and not who-knows-what.

Yeah, using closures is definitely a reasonable approach. I know there’s also a ppx floating around which lets you skip writing the fun () -> ....

The reason I am not already using Logs is I have my logging divided not by severity level (debug, info, warning, etc) but by portion of the program.

Um, isn’t this easy to avoid? Suppose you have some expression E that does the logging job. So you generate

lf !debug then
  E
else let _ : unit -> unit = fun () -> E in ()

?
It’s dead code, so hopefully the compiler will eliminate it?

Maybe if not with a ref cell, then with a PPX that reads an environment variable. (And with the warning about dead code disabled…)

Surely ppxlib provides a way to register cmdline arg/hooks that can be passed to the rewriter ? I mean, how else could ppx_import possibly work ?

That did work!

I’ve put the (very sketchy, to be sure) attempts I’ve been making here: GitHub - WardBrian/togglelog: A PPX writer which lets you compile out logging statements in release builds

I’m not sure if I’ll pursue this further since it does seem like there are plenty of other ways to get reasonable performance out of logging

1 Like

After thinking about it for a while, I realized that I’d missed the obvious improvement:

let f = fun () -> E in
if !debug then f() else ()

and when you want to compile-out the logging (but keep the code for type-checking):

let f = fun () -> E in
()

Have you look at Logs Src (logs.Logs.Src)?

You should be able to have different logging level for different part of your program by using specific sources instead of logging in the default one.

2 Likes

A very dumb solution might be to wrap your logging code in asserts, and then pass in -noassert to ocamlc:

assert (Format.printf "debug print: %s" <...>;true);
4 Likes

It does, this is explained in its manual here.

ppx_import runs from within the compiler process, using the -ppx flag directly. So it does not need particular arguments, it uses the compilation environment.

Wait … really? I thought PPX rewriters ran as an outboard process, invoked by the compiler, but still, a separate process ? Does the compiler pass along the -I arguments to the rewriter ?

Ppx rewriters invoked by the compiler get as input an AST that starts with an attribute describing some of the compilation context. The -I arguments in particular are added to the compiler’s load path, which is then transmitted to the ppx through this attribute.

1 Like

The intended and historical way of using ppx rewriter is by passing the -ppx argument when compiling.
It is possible, instead, to create a separate process that will parse, rewrite (possibly several passes) and emit a marshalled AST.
This second approach is not possible for ppx_import since it requires access to the typing information of dependencies, and that is not available if you’re only parsing source code.