Question

Possible Duplicate:
What is a reasonable code coverage % for unit tests (and why)?

I am in the middle of putting together some guidelines around unit test code coverage, and I want to specify a number that really makes sense. It's easy to repeat the 100% mantra that I see all over the internet without considering the cost-benefit analysis and when diminishing returns actually set in.

I'd like to hear from people who have actually reported code coverage on real-life, medium- to large-sized projects. What percentages were you seeing? How much is too much? I really want a balance (in figures) that will help developers produce high-quality code. Is 65% coverage too low to expect? Is 80% too high?


Solution

When you mix code coverage with cyclomatic complexity, you can use the CRAP metric.

From artima.com:

Individual Method Interpretation:

Bob Evans and I have looked at a lot of examples (using our code and many open source projects) and listened to a LOT of opinions. After much debate, we decided to INITIALLY use a CRAP score of 30 as the threshold for crappiness. Below is a table that shows the amount of test coverage required to stay below the CRAP threshold based on the complexity of a method:

Method’s Cyclomatic Complexity        % of coverage required to be
                                      below CRAPpy threshold
------------------------------        -------------------------------- 
0 – 5                                   0% 
10                                     42% 
15                                     57% 
20                                     71% 
25                                     80% 
30                                    100% 
31+                                   No amount of testing will keep methods    
                                      this complex out of CRAP territory.
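For reference, the CRAP score described in the article combines the two numbers as comp(m)^2 * (1 - cov(m))^3 + comp(m), where comp(m) is the method's cyclomatic complexity and cov(m) is its test coverage as a fraction. A quick sketch (illustrative only; rounding differs slightly from the table above in a couple of rows) that computes the score and the coverage needed to stay at or below the threshold of 30:

    def crap_score(complexity, coverage):
        """CRAP score for one method; coverage is a fraction between 0.0 and 1.0."""
        return complexity ** 2 * (1.0 - coverage) ** 3 + complexity

    def coverage_needed(complexity, threshold=30.0):
        """Smallest coverage fraction that keeps a method at or below the threshold."""
        if complexity == 0:
            return 0.0
        if complexity > threshold:
            return None  # no amount of testing keeps the method out of CRAP territory
        ratio = (threshold - complexity) / complexity ** 2
        return max(0.0, 1.0 - ratio ** (1.0 / 3.0))

    for c in (5, 10, 15, 20, 25, 30, 31):
        needed = coverage_needed(c)
        print(c, "n/a" if needed is None else "{:.0%}".format(needed))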

No amount of code coverage is going to guarantee "high quality code" by itself.

From the comments...

It's definitely too lax to give simple methods a pass on coverage. What you will likely find when applying this to existing code is that coverage will rise as you refactor those ugly methods (code coverage should rise; otherwise you're refactoring dangerously).

The 0–5s are essentially low-hanging fruit, and the ROI isn't all that great. That being said, those methods are wonderful for learning TDD because they're often very easy to test.

OTHER TIPS

Personally I would go for 80% coverage, but of course this is only relative... I haven't achieved this myself yet, either.

Currently we have very high coverage (99%) on our utility classes, which is good because bugs in there will haunt you throughout your whole application.

Coverage is mediocre for most GUIs, because writing tests for them is hard and time-consuming, so we often settle for opening the GUI in a unit test and, if no error occurs, closing it automatically.
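A minimal sketch of that kind of smoke test, assuming a hypothetical MainWindow class built on tkinter (the real application class and toolkit will differ, and it needs a display to run):

    import tkinter as tk
    import unittest

    class MainWindow(tk.Tk):
        """Stand-in for the real application window."""
        def __init__(self):
            super().__init__()
            self.title("My App")

    class GuiSmokeTest(unittest.TestCase):
        def test_window_opens_and_closes_without_error(self):
            window = MainWindow()  # construction must not raise
            window.update()        # process pending events once
            window.destroy()       # close it again if nothing blew up

    if __name__ == "__main__":
        unittest.main()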

I don't think you can really have too much code coverage. I think you need to determine what code runs the "regular course of business" in the application and have that completely covered. For the remaining code that isn't in the normal course of business, start whittling it down by covering the most critical parts first. Abnormal-business code that isn't terribly important offers little gain from good coverage.

The only correct answer is that you test as much as you can afford. Obviously, this is an axiom across every engineering project.

Beyond that, it's all subjective and highly dependent on the project at hand. For example, the flight control systems Lockheed puts out had better be tested to more than 80%, but 80% may suffice for my GUI front end to an XML viewer.

Typically, you break down the cost of running tests with your team. In theory, the answer to the question "how much testing can we afford?" is customarily expressed in man-hours.

After this, you examine your modules and determine which parts of the code have the most time spent in them. Each critical module should be covered at least once. From there, you allocate a number of tests proportional to the amount of time each module is executed. So in the end, there's no hard "X% is covered" number.
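Purely for illustration (module names, cost per test, and profile numbers below are made up), that allocation might be sketched like this:

    # Hypothetical numbers: 80 man-hours of budget, 2 hours per test, and a
    # runtime profile saying how much time is spent in each module.
    HOURS_PER_TEST = 2

    def allocate_tests(budget_hours, time_share_by_module):
        total_tests = int(budget_hours / HOURS_PER_TEST)
        total_time = sum(time_share_by_module.values())
        plan = {}
        for module, share in time_share_by_module.items():
            # at least one test per critical module, the rest proportional to runtime
            plan[module] = max(1, round(total_tests * share / total_time))
        return plan

    print(allocate_tests(80, {"parser": 50, "network": 30, "gui": 15, "logging": 5}))
    # {'parser': 20, 'network': 12, 'gui': 6, 'logging': 2}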

John Musa has a really interesting book on the subject.

On the program that I'm on (~500k SLOC), we use 100%. That is a program requirement to proceed to the next phase of testing. Here are the reasons behind it:

  1. The program is used in some safety-critical situations, and you don't want any off-nominal conditions to go untested.

  2. If you aren't hitting 100%, then you either wrote code that isn't necessary, and are hence wasting money, or you aren't testing your off-nominal paths completely. See #1.

  3. Your unit test scenarios should naturally get you close to 100%, regardless of the actual code coverage metric you're using. If someone is at 95% based solely on their off-nominal scenarios, requiring 100% isn't onerous (and, again, you should be asking why you aren't at 100% then. See #2.)

Your mileage will certainly vary. If you aren't working on a mission- or safety-critical application, then you probably don't need to worry about your code coverage as much - however, I'd have to ask again: why are you writing code that you don't need?

[Edit]

Based on the comments I've received below, I should clarify. The program guideline is 100% code coverage for unit tests. That development process requirement can be waived if, for a technical reason, a branch of code cannot be reached (a protected default constructor that is never called, etc.). Approval is usually granted by an external, independent part of the organization (go go SQA).
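Purely as an illustration of what a recorded exclusion can look like in tooling (not necessarily how the author's program handles its waivers): coverage.py, for example, lets you mark a genuinely unreachable clause so the exclusion stays explicit and reviewable instead of silently dragging the number below 100%.

    class Widget:
        def __init__(self, widget_id):
            self.widget_id = widget_id

        def _debug_dump(self):  # pragma: no cover - never called in production; excluded with approval
            return repr(self.__dict__)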

From an integration / systems test, code coverage becomes moot, as you start looking at requirements coverage. That's a different ball of yarn altogether.

The original question was looking for real-world situations: I agree that not all (most?) real-world situations will warrant 100% code coverage at the unit test level, but there are certainly cases and programs that do. And it is a habit of some developers to write code that they don't need, which then ends up untested. This becomes a maintenance nightmare, as a later developer will call methods that were never "meant" to be used (or were included because someone thought they were a "good" idea). Shooting for 100% coverage forces you to answer the question "why did I write this?"

It really depends. I know a lot of software that goes with 0%, and I have a lot of software with single-digit coverage. The main question is what is really needed, and wanted, in financial terms.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow