Question

Should there be a separate code coverage report for unit and integration tests, or one code coverage report for both?

The thinking behind this is that code coverage lets us make sure our code is covered by tests as far as possible (as much as a machine can know, anyway).

Having separate reports makes it easier to see what has not been covered by unit tests and what has not been covered by integration tests. But that way we cannot see the total coverage percentage.


Solution

Above all, you need to have and analyse combined (total) coverage. If you think about it, this is the most natural way to prioritize your risks and focus your test development effort.

Combined coverage shows you which code is not covered by any tests at all, i.e. which code is riskiest and needs to be investigated first. Separate coverage reports won't help here, because they don't let you find out whether the code is tested somewhere else or not tested at all.


Separate coverage analysis can also be useful, but it is best done after you are finished with the combined analysis, and preferably it should draw on the results of the combined analysis as well.

The purpose of separate coverage analysis differs from that of combined analysis. Separate analysis helps you improve the design of each test suite, whereas combined analysis is meant to decide which tests need to be developed regardless of which suite they end up in.

"Oh this gap isn't covered just because we forgot to add that simple unit (integration) test into our unit (integration) suite, let's add it" -- separate coverage and analysis is most useful here, as combined one could hide gaps that you would want to cover in particular suite.

Even from that perspective, it is still desirable to have the results of the combined coverage analysis at hand for the trickier cases. With those results, your test development decisions can be more efficient, because you also know what the "partner" test suites cover.

"There's a gap here, but developing a unit (integration) test to cover it would be really cumbersome, what are our options? Let's check combined coverage... oh it's already covered elsewhere, that is, covering it in our suite isn't critically important."

Other tips

You don't mention your testing tool. Many have "combine" functions that let you aggregate the results of multiple runs or suites. If you want an aggregate coverage metric, explore the combine feature in your coverage tool.
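
For instance, Python's coverage.py exposes a combine step through both its command line and its API. Here is a minimal sketch using the API, assuming each suite has already written its own data file (the names .coverage.unit and .coverage.integration are just placeholders for this example):

import coverage

cov = coverage.Coverage()
# Merge the per-suite data files into one combined data set.
cov.combine(data_paths=[".coverage.unit", ".coverage.integration"])
cov.save()
# Report the aggregate (total) coverage percentage across both suites.
cov.report()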


Now, can we talk about the elephant in the room?

There is no spoon. And there is no "total coverage percentage." At least, no simple one.

Coverage percentage is a readily-comprehended metric presented to help understand the scope, depth, and range of testing suites. But like any simple benchmark, it's very easy to become target-fixated on this value as some sort of magical talisman of "complete testing."

Let's say you have achieved the glory of "100% test coverage." Yay! But what does that mean? 100% of code lines are tested, right? Then what about this line?

launch_missile = launch_authorized if launch_cmd_given else previous_launch_status

"Covering" that line means something--but not a whole lot, because there are a variety of conditions which are True or False with some probability, but it's unlikely that you have tested all of the combinations of those conditions. Even if that line is covered a dozen times, if one of the conditions is relatively uncommon, you haven't come close to testing all of the real results that might occur in practice. To make that clearer, a more synthetic example:

engage_laser = (laser_armed and safety_disengaged) or random.random() < 0.0000003

How many times would you have to cover that line to really exhaustively test it? How many times would you have to cover it to test it in combination with all of the other variables in the program (with their own, possibly similarly rare) probabilities?
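
To make the gap concrete, here is a hypothetical test for a function wrapping the launch line above. The function and test names are invented for illustration; the point is that a single assertion reaches 100% line coverage while exercising only one of the possible combinations of inputs.

def decide_launch(launch_authorized, launch_cmd_given, previous_launch_status):
    return launch_authorized if launch_cmd_given else previous_launch_status

def test_decide_launch():
    # Every statement in decide_launch runs, so line coverage reports 100%,
    # yet the "no command given" path and the other truth-value combinations
    # of the inputs are never tested.
    assert decide_launch(True, True, False) is True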

I'm not saying that coverage metrics are useless. They're actually great. They focus on one of the key issues: How extensively is my software system tested? They help move from "we have some tests" to "we have thoroughly tested."

But while you're working on "combined scores," the reality is that your score will typically measure "statement coverage" rather than "condition," "predicate," or "path" coverage. So whatever number your aggregate score gives you, it's unlikely to be a true picture of how many of your program's potential states and state combinations are being tested. While you're working on increasing your coverage percentage, consider also measuring your predicate coverage. It will give you a more realistic--and almost invariably, more sobering--view of test extensiveness.
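
As one step in that direction, coverage.py (to stay with the same example tool) can also measure branch coverage in addition to statement coverage. That is still far from full predicate or path coverage, but it does expose untaken decision outcomes that a pure line count hides. A minimal sketch, assuming your test suite can be invoked programmatically:

import coverage

cov = coverage.Coverage(branch=True)  # measure branches, not just statements
cov.start()

# ... run your unit and integration suites here (e.g. via pytest.main()) ...

cov.stop()
cov.save()
# With branch=True, the report includes branch counts alongside line counts.
cov.report(show_missing=True)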
