Abstract
As incorporating software testing into programming assignments becomes routine, educators have begun to assess not only the correctness of students’ software, but also the adequacy of their tests. In practice, educators rely on code coverage measures, even though their shortcomings are widely known. Mutation analysis is a stronger measure of test adequacy, but it is too costly to apply beyond the small programs developed in introductory programming courses. We demonstrate how to adapt mutation analysis to provide rapid automated feedback on software tests for complex projects in large programming courses. We study a dataset of 1389 student software projects ranging from trivial to complex. We begin by showing that, although the state of the art in mutation analysis is practical for providing rapid feedback on projects in introductory courses, it is prohibitively expensive for the more complex projects in subsequent courses. To reduce this cost, we use a statistical procedure to select a subset of mutation operators that maintains accuracy while minimizing cost. We show that with only two operators, costs can be reduced by a factor of 2–3 with a negligible loss in accuracy. Finally, we evaluate our approach on open-source software and find that our results may generalize beyond the educational context.