How to robustify Cram tests

Our use of Cram tests in our codebase is steadily increasing, and with it the number of issues caused by poorly written tests.

We are looking for ways to help ensure our tests are more robust. Note that this is being applied to a largish codebase, so the issues here are probably not relevant for most users.

One common problem that happens is:

  • we have our tool (let’s call it fooc) already installed in the PATH;
  • we create a new Cram test where we call $ fooc <options> somewhere;
  • we forget to add a dependency on fooc in a dune file related to the test (possibly in a parent directory);
  • dune runtest works because the shell sees the installed fooc binary, uses it, and things seem to work;
  • we modify the code of fooc and re-run the test, and things no longer work, because the new fooc was not used.

So, we now are resorting to dune clean && dune runtest, to help catch this situation.

But it’s still not enough, because sometimes the error is not deterministic. Also, running dune runtest <test> does not always result in the same errors as dune runtest.

Are there some best practices that can help detect/debug this situation? For instance:

  • Partially emptying the environment before a Cram test; maybe clearing PATH?
  • “Poisoning” the environment, i.e. adding some extra directory to PATH that will lead to errors if some non-compiled-here tool is being used?
  • adding set -x or strace, or something to a test, then checking if it finds external dependencies that should not have been used?

I guess the ultimate solution would be to run the Cram tests inside a minimal Docker container… is this currently used by someone?

Any other sandboxing ideas?

Yes, it is not the same. If you want to run a single test, you have to run dune build @that-test-without-t, i.e. the alias named after the test without its .t suffix. I also think it is very confusing.

As for your PATH issue, that’s very weird. In Dune we have a bit over five thousand Cram tests, and none of them has this problem. I took a look and I don’t think we do anything specific there: dune automatically prepends the install location to the PATH when running the test (this is usually _build/install/default/bin). Maybe it is worth exploring why that doesn’t seem to be the case for you?

So, we now are resorting to dune clean && dune runtest, to help catch this situation.

Consider writing (cram (deps %{bin:fooc})) where you’ve defined your cram test. That will make them re-run whenever fooc changes.
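Spelled out as a stanza, that suggestion could look like the sketch below. The file location (a dune file in the directory containing the Cram tests) is an assumption, not something stated above. With no applies_to field, the stanza applies to every Cram test under that directory.

```dune
; Hypothetical tests/dune file; the directory name is an assumption.
; Every Cram test under this directory depends on the fooc binary
; built in this workspace, so the tests re-run whenever fooc changes.
(cram
 (deps %{bin:fooc}))
```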

I do have plans to improve sandboxing this year to help find some of the issues. Stay tuned.


The initial message was wrong when describing one part of our issues. We were getting different results when running dune clean && dune build @tests/runtest-foo and dune clean && dune build @tests/runtest on a test using the test stanza. But we figured out that we needed a dependency on the install alias here, for internal reasons we have not yet fully understood.

Many of our issues come from two major points:

  • We have 4 different test methods: Cram tests, the dune test stanza, inline tests, and an internal testing tool for another kind of tests.
  • Our tool has a “kernel” part and plug-ins, with one package for each. It is quite easy to forget a plug-in when describing dependencies, and a dependency on the kernel is not enough.

How are you describing dependencies on the kernel and plugins currently?

For cramtests we used the usual

(cram
 (applies_to my_test)
 (deps
  (package kernel)
  (package plugin1)
  …))

And thought

(test
 (name my_other_test)
 (libraries fooc.init fooc.kernel)
 (flags :standard -open Fooc_kernel))

was enough, but it turns out we also needed (deps (package fooc)) here. We thought that listing the libraries would be enough to create the dependencies automatically, but we were wrong. This dependency replaces the dependency on the install alias mentioned above.
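For illustration, the test stanza with that extra field added would look roughly like this. It is a sketch that simply combines the two fragments quoted above; the placement of the deps field within the stanza is a free choice.

```dune
(test
 (name my_other_test)
 (libraries fooc.init fooc.kernel)
 (flags :standard -open Fooc_kernel)
 ; The libraries field alone did not give us this dependency,
 ; so the package is listed explicitly.
 (deps
  (package fooc)))
```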