How to set up unit testing in 2023

On the problem of testing a module’s internals, which is currently only possible with inline tests: RWOC makes an argument here, under “Where Should Tests Go?” (quoted below), that internal test-only libraries should almost always be used instead. I agree with that and thought it would be good to post here. I’m trying to avoid using any inline tests.

From RWOC:

Putting tests directly in the library you’re building certainly has some benefits. For one thing, it lets you put a test for a given function directly after the definition of that function, which in some cases can be good for readability. This approach also lets you test aspects of your code that aren’t exposed by its external interface.

While this sounds appealing at first glance, putting tests in libraries has several downsides.

  • Readability. Including all of your tests directly in your application code can make that code itself harder to read. This can lead to people writing too few tests in an effort to keep their application code uncluttered.
  • Bloat. When your tests are written as a part of your library, it means that every user of your library has to link in that testing code in their production application. Even though that code won’t be run, it still adds to the size of the executable. It can also require dependencies on libraries that you don’t need in production, which can reduce the portability of your code.
  • Testing mindset. Writing tests on the inside of your libraries lets you write tests against any part of your implementation, rather than just the exposed API. This freedom is useful, but can also put you in the wrong testing mindset. Testing that’s phrased in terms of the public API often does a better job of testing what’s fundamental about your code, and will better survive refactoring of the implementation. Also, the discipline of keeping tests outside of [the library] requires you to write code that can be tested that way, which pushes towards better designs.

For all of these reasons, our recommendation is to put the bulk of your tests in test-only libraries created for that purpose. There are some legitimate reasons to want to put some tests directly in your production library, e.g., when you need access to some functionality to do the test that’s important but is really awkward to expose. But such cases are very much the exception.
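
As a rough sketch of what that recommendation can look like in a dune project: the test library below depends on the production library and only touches its public API. The directory layout, the names mylib, mylib_test and clamp are all made up for illustration, and the production module is written inline here only so the snippet is self-contained.

```ocaml
(* Assumed layout (all names hypothetical):
     src/dune   (library (name mylib))
     test/dune  (library
                 (name mylib_test)
                 (libraries mylib)
                 (inline_tests)
                 (preprocess (pps ppx_inline_test))) *)

(* Stand-in for src/mylib.ml; in a real project this is its own file. *)
module Mylib = struct
  (* Restrict x to the closed interval [lo, hi]. *)
  let clamp ~lo ~hi x = max lo (min hi x)
end

(* test/mylib_test.ml: tests go through the public API only. *)
let%test "clamp keeps in-range values" = Mylib.clamp ~lo:0 ~hi:10 5 = 5
let%test "clamp raises low values" = Mylib.clamp ~lo:0 ~hi:10 (-3) = 0
let%test "clamp lowers high values" = Mylib.clamp ~lo:0 ~hi:10 42 = 10
```

With a layout like this, `dune runtest` should pick up the tests, while users of mylib never link the test code.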

Thanks for all your contributions, guys. I’ll use this info to update the ocamlverse page.

I decided to go with ppx_expect and ppx_inline_test, but to use both in external test modules, the same way Jane Street does. ppx_expect has the same advantage that debuggers have over compile cycles with printfs: you can look at the entire state at once without having to reason about specific conditions, which lets you be both more exploratory and more comprehensive in your testing. I think ppx_expect and its brethren are generally superior to plain old unit testing.

Of course, this assumes you have printers for everything. In my case, I use ppx_yojson everywhere, so I can easily print the state of just about everything in my program.

Why do I still need ppx_inline_test? There are some things that I don’t have printers for. For example, some functions return a slew of polymorphic variants signifying different results, and I don’t want to close those variants off by naming them in a specific type, so I can’t derive their printer with ppx_yojson. For those cases I still use the more limited approach of writing specific condition tests, and it feels far more painful.
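
To make the contrast concrete, here is a minimal sketch of the two styles side by side. The state type, step, move and the hand-written show_state printer are hypothetical stand-ins (my real code derives JSON printers instead), and the snippet assumes it lives in a dune library with ppx_expect and ppx_inline_test enabled.

```ocaml
(* Hypothetical program state; a hand-written printer stands in for a
   derived one. *)
type state = { score : int; lives : int }

let show_state s = Printf.sprintf "{ score = %d; lives = %d }" s.score s.lives

(* Hypothetical update function under test. *)
let step s = { s with score = s.score + 10 }

(* Expect-test style: print the whole state and review it in one go. *)
let%expect_test "step shows the full state" =
  print_endline (show_state (step { score = 0; lives = 3 }));
  [%expect {| { score = 10; lives = 3 } |}]

(* Hypothetical function returning open polymorphic variants, for which
   no printer is derived. *)
let move pos = if pos < 0 then `Out_of_bounds else `Moved (pos + 1)

(* Condition-test style: one boolean check per case. *)
let%test "move rejects negative positions" = move (-1) = `Out_of_bounds
let%test "move advances by one" = move 2 = `Moved 3
```

The expect test keeps working as the state record grows new fields (the diff just shows them), whereas each `let%test` only checks the one condition it was written for.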

The one annoying thing I encountered was that neither ppx_expect nor ppx_inline_test works for executables. I had to split my executable into a library that contains all the code and an executable wrapper that simply calls main () on the library. Not a huge deal, but this could be improved.
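
The split looks roughly like this. The names game, main and banner are placeholders rather than my actual code, and the library is shown as an inline module only so that the snippet stands on its own.

```ocaml
(* Assumed dune stanzas (names hypothetical):
     lib/dune  (library (name game))
     bin/dune  (executable (name main) (libraries game)) *)

(* Stand-in for lib/game.ml: everything testable lives in the library. *)
module Game = struct
  (* Pure helper that a test library can call directly. *)
  let banner ~player = "Welcome, " ^ player

  (* Entry point that the executable wrapper calls. *)
  let main () = print_endline (banner ~player:"player one")
end

(* bin/main.ml: the entire executable is this single call. *)
let () = Game.main ()
```

A separate test library can then list game in its `libraries` field and test things like `Game.banner` without going through the executable at all.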

Note that for executables, dune also exposes cram tests ("expectation tests written in a shell-like syntax").

Also, slightly off topic since it’s not unit testing: property-based testing as exposed by qcheck is really nice. If you have some properties in mind about your functions that you want to check, instead of picking some arbitrary input, let qcheck do it for you with many random inputs.
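
For instance, a minimal sketch of such a property, assuming the qcheck-core package and its runner library are available. The property itself (reversing a list twice gives back the list) is just a placeholder for whatever invariant your own functions should satisfy.

```ocaml
(* Assumes the qcheck-core and qcheck-core.runner libraries are listed
   in the dune stanza of the test. *)

(* Property: reversing a list twice yields the original list.
   QCheck generates 1000 random int lists to check it against. *)
let rev_involutive =
  QCheck.Test.make ~count:1000 ~name:"List.rev is its own inverse"
    QCheck.(list small_int)
    (fun l -> List.rev (List.rev l) = l)

(* run_tests returns 0 when every property holds, non-zero otherwise;
   a real test executable would typically pass this to exit. *)
let exit_code = QCheck_base_runner.run_tests ~verbose:true [ rev_involutive ]
```

When a property fails, qcheck also tries to shrink the failing input to a smaller counterexample, which is a big part of what makes it pleasant to use.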

In my case, the executable is a game, so it’s not about executing command line arguments.

Have you heard about Handmade Hero?

It’s not OCaml (it’s C), but maybe that resource could give you a few ideas on how to hunt and squash bugs within your game.

The author does not do testing, but he creates custom debug tools; see an example here: Reenabling More Debug UI | Handmade Hero Episode Guide | Handmade Network

A downside of ppx_expect is the output channel capture. It makes it a bit harder to debug e.g. infinite loops.
