I am working on a script to grade students’ submissions. The current idea is to load a submission file into the toplevel and run test cases against the functions in that file.
I chose the toplevel because some functions in a submission may have type errors. Say one function in a submission accepts only one argument when it is supposed to take two. If I compile the submission together with the grading code, that type error stops the whole compilation. If I run it in the toplevel instead, only the test cases for that function fail and the following test cases still run, so the student can get partial credit.
That is the background. My questions are:
- Is it possible to catch type errors in the toplevel? If I could do this, I could give better feedback.
- Could a test framework like OUnit help? I admit that using the toplevel is somewhat hacky. The tricky part is that I have to be tolerant enough of broken submission files.
You might want to look at the work done for the OCaml MOOC:
They do precisely what you describe (and quite a bit more, in fact).
It looks like an interesting and practical solution. It may take some time to migrate all my code, since it does quite a bit more.
Thanks for sharing this.
I also have a framework for grading code (not public at the moment), but it compiles the students’ code because the work it has to do may be intensive (it is used for a numerical analysis course). It is language agnostic and judges the submission on (more or less) random inputs/outputs. There is no reason why we wouldn’t be able to run the code in the toplevel instead, and indeed we thought about it, but it has not been developed.
Note that there are two main problems with executing unknown code: security (reading/writing files, accessing other processes including other students’ submissions, detaching background processes, …) and resource consumption (filling the disk, using 100% of all CPUs, …). We solved this by using Docker and a per-submission timeout.
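To make the random input/output idea concrete, here is a minimal in-process sketch in OCaml. It is not the framework described above (which is language agnostic and works on compiled programs); `reference`, `student`, `random_list` and `score` are hypothetical names invented for the example, and the task (summing a list) is just a placeholder.

```ocaml
(* Hypothetical reference solution: sum of a list of integers. *)
let reference (xs : int list) : int = List.fold_left ( + ) 0 xs

(* Hypothetical student submission for the same task. *)
let student (xs : int list) : int =
  match xs with
  | [] -> 0
  | x :: rest -> x + List.fold_left ( + ) 0 rest

(* Generate a random list of 0 to 9 integers, each in [0, 100). *)
let random_list (st : Random.State.t) : int list =
  List.init (Random.State.int st 10) (fun _ -> Random.State.int st 100)

(* Run [trials] random tests and count how many the candidate passes.
   An exception raised by the candidate only fails that one test. *)
let score ~trials ~seed (candidate : int list -> int) : int =
  let st = Random.State.make [| seed |] in
  let passed = ref 0 in
  for _i = 1 to trials do
    let input = random_list st in
    match candidate input with
    | out when out = reference input -> incr passed
    | _ -> ()
    | exception _ -> ()
  done;
  !passed

let () =
  Printf.printf "passed %d of %d tests\n"
    (score ~trials:50 ~seed:42 student) 50
```

Fixing the seed keeps the grading reproducible, so a student who disputes a mark can be shown the same inputs again. Note that this only isolates run-time failures; a submission that does not type-check would still break compilation.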
If you are interested in seeing the code, let me know. It is bundled with the specific code for the course right now but the project could be split and made more general if you wish to use it.
The “expect” test framework used in the OCaml compiler test suite could be helpful.
For example, https://github.com/ocaml/ocaml/blob/trunk/testsuite/tests/typing-modules/firstclass.ml
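As far as I understand, expect tests work by feeding each phrase to the toplevel and comparing what it prints, type errors included. The same idea can be sketched by hand with the `Toploop` module from compiler-libs. This is only a rough sketch under several assumptions: the script is run with the `ocaml` toplevel (not compiled), compiler-libs is installed, and `Toploop` behaves this way in your compiler version (its interface is not guaranteed to be stable). `eval_phrase` and the `add` submission are made up for the example.

```ocaml
(* Run as a script with the toplevel, e.g. `ocaml grade.ml`. *)
#directory "+compiler-libs";;

(* Evaluate one toplevel phrase given as a string.
   Returns [Ok ()] on success, or [Error msg] where [msg] is whatever
   the toplevel reported (syntax error, type error, ...). *)
let eval_phrase (code : string) : (unit, string) result =
  let lexbuf = Lexing.from_string code in
  try
    let phrase = !Toploop.parse_toplevel_phrase lexbuf in
    let ok = Toploop.execute_phrase false Format.str_formatter phrase in
    let output = Format.flush_str_formatter () in
    if ok then Ok () else Error output
  with exn ->
    (* Compiler errors are raised as exceptions; format the known ones,
       and fall back to a generic rendering for anything else. *)
    (try Location.report_exception Format.str_formatter exn
     with _ ->
       Format.fprintf Format.str_formatter "%s" (Printexc.to_string exn));
    Error (Format.flush_str_formatter ())

(* Hypothetical submission: [add] takes one argument, the test passes two. *)
let () =
  ignore (eval_phrase "let add x = x + 1;;");
  match eval_phrase "let _ = assert (add 1 2 = 3);;" with
  | Ok () -> print_endline "test passed"
  | Error msg -> print_endline ("test failed: " ^ msg)
```

Because each test is a separate phrase, a type error in one test is reported for that test only, and the script carries on with the next one.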
Your solution looks like the easiest way to solve my problem. I will try it.
I agree that executing unknown code is dangerous. My grading script also runs in Docker. I also use a timeout, since student code may contain an infinite loop and I don’t want to solve the halting problem.