[ANN] qcheck-lin and qcheck-stm 0.1.0

We are happy to announce the release of two new opam packages:
qcheck-lin and qcheck-stm for black-box property-based testing of
module interfaces under parallel usage. Both these libraries
build on top of QCheck - hence their names.

  • qcheck-lin requires little more than an interface
    description. It allows to test a library for sequential
    consistency, that is, whether results obtained from using it in
    parallel agree with some linear, single domain execution.

  • qcheck-stm is a model-based, state-machine framework for both
    sequential and parallel testing. It allows to test an
    imperative interface against a pure model description, and
    thereby allows to express and test intended behaviour beyond a
    signature description.

We presented preliminary work on both these libraries at the
OCaml Workshop 2022 earlier this year. The libraries furthermore
underlie our work to test the new multicore runtime of OCaml 5.0,
and have helped us identify a number of issues.

More information is available from the GitHub repository:

GitHub - ocaml-multicore/multicoretests: PBT testsuite and libraries for testing multicore OCaml

Happy multicore testing!

10 Likes

Interesting, this looks like a nice addition to my testing “toolbox”.
I’ve been using all of crowbar, qcstm, monolith and each has its own strenghts (what sparked my interest was in fact the presentation about qcstm from an earlier ICFP):

  • monolith:

    • :heavy_plus_sign: is like qcstm but integrates with afl with speeds up finding bugs a lot on multicore machines
    • :heavy_minus_sign: it doesn’t integrate nicely with testcase shrinking (you can use afl-tmin but there is a bit of additional machinery to integrate all that into an efficient bugfinding cycle)
  • qcstm:

    • :heavy_plus_sign: comparison with a reference implementation
    • :heavy_plus_sign: shrinking helps produce understandable and actionable bugreports
    • :heavy_minus_sign: doesn’t integrate with afl, so once you’ve run out of the low hanging fruit to fix it takes longer to find more bugs
  • crowbar

    • :heavy_plus_sign: integrates with afl
    • :heavy_minus_sign: lacks the state machine exploration/reference implementation comparison that monolith and qcstm provide

I haven’t tried it yet, but qcheck-stm looks like a nice evolution of qcstm that addresses one of its disadvantages compared to crowbar/monolith: the lack of parallelization.

Could you please clarify the connection between qcstm and qcheck-stm though? I see the README reference qcstm and some of the authors are shared, however the new qcheck-stm has a lower version number than the old qcstm.
Based on " STM contains a revision of qcstm" I’d assume that qcheck-stm is a superset of qcstm, and qcstm is deprecated in favour of qcheck-stm?

FWIW both qcstm and monolith have helped discover some bugs and security issues in oxenstored (the ocaml implementation of ‘xenstored’ in Xen) a few years ago.
My initial intent was to compare against a reference implementation, not specifically to look for security bugs, and the first bug found was a regular bug:

However fairly quickly a combination of fuzzing, stress testing and manual code review has spotted a large number of security bugs, some very obvious:

qcstm/monolith didn’t find these bugs directly, at least not initially, but they pointed out enough suspicious behaviour in that code area that tweaking the fuzzer would then find the security bug seen via manual review.

If it wasn’t for A Simple State-Machine Framework for Property-Based Testing in OCaml (OCaml 2020) - ICFP 2020 then a lot of these (security) bugs wouldn’t have been found, so thanks for helping me improve the quality of oxenstored!

3 Likes

I’m curious about shrinking for Monolith. If I understand correctly, currently the only form of shrinking/reduction proposed by Monolith is to use afl-tmin, but (according to your experience) this does not work so well. Have you talked to upstream (@fpottier) about your need for shrinking?

(Bonus question: do you think that Citrix might be willing to fund work to add shrinking to Monolith?)

In my experience shrinking is actually both delicate and a fun/interesting problem to work on. (From the library-user perspective, writing user-defined shrinking logic is also interesting and benefits from good support from the library.) Qcheck recently changed its approach to shrinking, it uses “integrated shrinking” now, and it was a lot of work.

1 Like

Thanks for your kind feedback Edwin. I didn’t know of any real qcstm users so you’ve made my day! :smiley:

qcheck-stm is indeed intended as a replacement of qcstm, which was missing the ability to run parallel tests. The interfaces required are largely the same, except we’ve had to split qcstm’s
run_cmd into two separate operations run and postcond in qcheck-stm in order to make parallel testing work. It should therefore be relatively easy to port tests from qcstm to qcheck-stm. We did so ourselves for some of the example qcstm Stdlib tests to stress test the standard library under parallel usage with OCaml 5.

qcheck-stm and qcstm are (as you point out) based on black-box property-based testing - whereas crowbar and monolith are coverage-driven (“grey-box”) driven by AFL-instrumentation. Each of these have their advantages as you also point out. In the present case, the black-box approach was relatively easy to get to work with non-deterministic parallel code in the style of Erlang’s QuickCheck and its derivatives.

In the longer term, I’m hoping that we can (eventually) unite these efforts in a full-featured, common testing library for OCaml :smiley::crossed_fingers:

afl-tmin does provide a smaller testcase, (that may be minimal from a number of operations point of view), but it doesn’t provide a “nice” testcase. E.g. it probably has some randomly generated numbers, or some ugly binary characters in a string, where the bug may also be reproducible with something simpler, like ‘0’, ‘1’, or ‘a’.
And ‘qcstm’ was a lot better at finding those nice to read testcases (obviously there could be bugs that depend on the actual value of integers/strings, not just on operation types/order, but they’re more rare).

I haven’t spoken with upstream about this because at the time I was developing code that used qcstm/monolith the actual issue being fixed was under a security embargo, so it would’ve been difficult to collaborate if I couldn’t show any examples. All oxenstored related security embargoes (that I’m aware of) have now expired and the security fixes themselves are all public as of November, so there is an opportunity for some discussion/improvements in this space now.

I’ve worked around the problem by writing a crowbar wrapper with a qcstm-style API though, which might be useful for the unified efforts?

That is a complicated question to answer at this point, the company is undergoing some transformation post the take-private/merger that was completed a few months ago, and XenServer has split off from Citrix: as you can see on cloud.com it is a sibling to it under the new corporate parent. I’ll wait until some of the internal business changes settle internally to see whether the new corporate entity/unit could/would sponsor external work, it is not as easy as you would think (it is not just a matter of finding a budget). I can try raising this again internally in a couple of months.

2 Likes