I think most people now use branch coverage rather than statement coverage as a quality metric, but one metric I've not seen much about is the quality of the tests themselves.
For example, I could write tests which exercise many of the branches in my code, but none of the tests do an assert. So while I've executed a lot of my branches, I've not checked the return conditions properly. Is there any way to capture this "assert" metric?
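To make this concrete, here is a minimal hypothetical sketch (JUnit 5; the `Calculator` class is invented for illustration). Both tests produce identical branch coverage on a trivial `abs()` method, but only the second would ever catch a bug:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class Calculator {
    static int abs(int x) {
        return x < 0 ? -x : x;
    }
}

class CalculatorTest {

    // Exercises both branches of abs(), so branch coverage is 100%,
    // but nothing is ever checked: a broken abs() would still pass.
    @Test
    void coversBranchesButAssertsNothing() {
        Calculator.abs(5);
        Calculator.abs(-5);
    }

    // Identical coverage, but the assertions actually pin the behaviour down.
    @Test
    void coversBranchesAndAsserts() {
        assertEquals(5, Calculator.abs(5));
        assertEquals(5, Calculator.abs(-5));
    }
}
```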
Are people using any metrics on the tests themselves?
The blog post "What does code coverage really mean?" deals with this question. The study it reports indicates that, for unit tests, code coverage is generally a good indicator of how reliably the tests catch regressions. For system tests (which execute large portions of a software system), code coverage is not a useful approximation of that reliability.
Mutation testing can be used to evaluate the effectiveness of test cases. The idea is to mutate the source code by introducing faults and to check whether the test cases are capable of detecting them. The usual approach is to apply a mutation operator (e.g., remove a line of code, replace an addition with a subtraction, invert a boolean condition) to a single method, run all tests, and check whether at least one test case fails. The test cases that fail were able to reveal the broken code. The downsides of mutation testing are its computational cost and the problem of equivalent mutants distorting the results (equivalent mutants are code chunks that are syntactically mutated but semantically unchanged). Pitest is a mutation testing system for Java that is used in industry.
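To illustrate the idea, here is a minimal sketch (JUnit 5; the `max()` method, the mutants, and the test are invented for illustration and are not actual Pitest output):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

class MutationExampleTest {

    // Original method under test.
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    // A mutant created by negating the condition (a > b becomes a <= b).
    static int maxMutant(int a, int b) {
        return a <= b ? a : b;
    }

    // An equivalent mutant: a > b becomes a >= b. When a == b, both versions
    // return the same value, so the change is semantically invisible and no
    // test can ever kill it -- this is what distorts mutation scores.
    static int maxEquivalentMutant(int a, int b) {
        return a >= b ? a : b;
    }

    // This test "kills" maxMutant: it passes against max(), but would fail
    // if max() were replaced by maxMutant().
    @Test
    void maxReturnsTheLargerArgument() {
        assertEquals(7, max(3, 7));
        assertEquals(7, max(7, 3));
    }
}
```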
Concerning test cases that do not contain any assertions, Martin Fowler writes:
Although assertion-free testing is mostly a joke, it isn’t entirely useless. [...] Some faults [such as null pointer exceptions] do show up through code execution.
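For example (a contrived sketch, not from Fowler's post, with a `describe()` method invented for illustration), a test with no assertions still fails when the code under test throws, because the test runner reports the uncaught exception as an error:

```java
import org.junit.jupiter.api.Test;

class AssertFreeNpeTest {

    // Hypothetical method with a null-dereference bug: it maps unknown
    // keys to null and then dereferences the result.
    static String describe(String key) {
        String value = key.equals("known") ? "a known key" : null;
        return value.toUpperCase(); // NullPointerException when value == null
    }

    // No assertions at all, yet the test still fails: the
    // NullPointerException thrown during execution is reported as an error.
    @Test
    void assertFreeButStillCatchesTheFault() {
        describe("unknown");
    }
}
```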