We are happy to announce the release of two new opam packages: qcheck-lin and qcheck-stm for black-box property-based testing of
module interfaces under parallel usage. Both these libraries
build on top of QCheck - hence their names.
qcheck-lin requires little more than an interface
description. It allows to test a library for sequential
consistency, that is, whether results obtained from using it in
parallel agree with some linear, single domain execution.
qcheck-stm is a model-based, state-machine framework for both
sequential and parallel testing. It allows to test an
imperative interface against a pure model description, and
thereby allows to express and test intended behaviour beyond a
signature description.
We presented preliminary work on both these libraries at the
OCaml Workshop 2022 earlier this year. The libraries furthermore
underlie our work to test the new multicore runtime of OCaml 5.0,
and have helped us identify a number of issues.
More information is available from the GitHub repository:
Interesting, this looks like a nice addition to my testing âtoolboxâ.
Iâve been using all of crowbar, qcstm, monolith and each has its own strenghts (what sparked my interest was in fact the presentation about qcstm from an earlier ICFP):
monolith:
is like qcstm but integrates with afl with speeds up finding bugs a lot on multicore machines
it doesnât integrate nicely with testcase shrinking (you can use afl-tmin but there is a bit of additional machinery to integrate all that into an efficient bugfinding cycle)
qcstm:
comparison with a reference implementation
shrinking helps produce understandable and actionable bugreports
doesnât integrate with afl, so once youâve run out of the low hanging fruit to fix it takes longer to find more bugs
crowbar
integrates with afl
lacks the state machine exploration/reference implementation comparison that monolith and qcstm provide
I havenât tried it yet, but qcheck-stm looks like a nice evolution of qcstm that addresses one of its disadvantages compared to crowbar/monolith: the lack of parallelization.
Could you please clarify the connection between qcstm and qcheck-stm though? I see the README reference qcstm and some of the authors are shared, however the new qcheck-stm has a lower version number than the old qcstm.
Based on " STM contains a revision of qcstm" Iâd assume that qcheck-stm is a superset of qcstm, and qcstm is deprecated in favour of qcheck-stm?
FWIW both qcstm and monolith have helped discover some bugs and security issues in oxenstored (the ocaml implementation of âxenstoredâ in Xen) a few years ago.
My initial intent was to compare against a reference implementation, not specifically to look for security bugs, and the first bug found was a regular bug:
qcstm/monolith didnât find these bugs directly, at least not initially, but they pointed out enough suspicious behaviour in that code area that tweaking the fuzzer would then find the security bug seen via manual review.
Iâm curious about shrinking for Monolith. If I understand correctly, currently the only form of shrinking/reduction proposed by Monolith is to use afl-tmin, but (according to your experience) this does not work so well. Have you talked to upstream (@fpottier) about your need for shrinking?
(Bonus question: do you think that Citrix might be willing to fund work to add shrinking to Monolith?)
In my experience shrinking is actually both delicate and a fun/interesting problem to work on. (From the library-user perspective, writing user-defined shrinking logic is also interesting and benefits from good support from the library.) Qcheck recently changed its approach to shrinking, it uses âintegrated shrinkingâ now, and it was a lot of work.
Thanks for your kind feedback Edwin. I didnât know of any real qcstm users so youâve made my day!
qcheck-stm is indeed intended as a replacement of qcstm, which was missing the ability to run parallel tests. The interfaces required are largely the same, except weâve had to split qcstmâs run_cmd into two separate operations run and postcond in qcheck-stm in order to make parallel testing work. It should therefore be relatively easy to port tests from qcstm to qcheck-stm. We did so ourselves for some of the example qcstmStdlib tests to stress test the standard library under parallel usage with OCaml 5.
qcheck-stm and qcstm are (as you point out) based on black-box property-based testing - whereas crowbar and monolith are coverage-driven (âgrey-boxâ) driven by AFL-instrumentation. Each of these have their advantages as you also point out. In the present case, the black-box approach was relatively easy to get to work with non-deterministic parallel code in the style of Erlangâs QuickCheck and its derivatives.
In the longer term, Iâm hoping that we can (eventually) unite these efforts in a full-featured, common testing library for OCaml
afl-tmin does provide a smaller testcase, (that may be minimal from a number of operations point of view), but it doesnât provide a âniceâ testcase. E.g. it probably has some randomly generated numbers, or some ugly binary characters in a string, where the bug may also be reproducible with something simpler, like â0â, â1â, or âaâ.
And âqcstmâ was a lot better at finding those nice to read testcases (obviously there could be bugs that depend on the actual value of integers/strings, not just on operation types/order, but theyâre more rare).
I havenât spoken with upstream about this because at the time I was developing code that used qcstm/monolith the actual issue being fixed was under a security embargo, so it wouldâve been difficult to collaborate if I couldnât show any examples. All oxenstored related security embargoes (that Iâm aware of) have now expired and the security fixes themselves are all public as of November, so there is an opportunity for some discussion/improvements in this space now.
Iâve worked around the problem by writing a crowbar wrapper with a qcstm-style API though, which might be useful for the unified efforts?
That is a complicated question to answer at this point, the company is undergoing some transformation post the take-private/merger that was completed a few months ago, and XenServer has split off from Citrix: as you can see on cloud.com it is a sibling to it under the new corporate parent. Iâll wait until some of the internal business changes settle internally to see whether the new corporate entity/unit could/would sponsor external work, it is not as easy as you would think (it is not just a matter of finding a budget). I can try raising this again internally in a couple of months.