Question

We're outsourcing some work to an external developer, so I'm busy writing up a contract about what constitutes a deliverable.

So far I require that the code is shipped with automated tests.

But, what is a reasonable way to specify the detail of tests up-front in the contract in a measurable way?

I'm loath to say "100% code coverage", because it's been established often enough that 100% is fairly meaningless, and the diminishing returns above roughly 70-80% would probably just push up our costs unnecessarily, and possibly even the complexity of certain things that might otherwise be very simple.

Internally we pretty much leave it up to our developers to decide on the level of tests needed, based on their intuition and experience. With a contractor however there is a fixed price that has to be agreed to up front and we need some way to enforce a certain level of quality.

Any suggestions or recommended reading matter would be appreciated!


Solution

When subcontracting out, it is up to you to ensure the code being written at least works the way you need it to. For that reason, your team will need to write some automated acceptance tests. Provide those tests to your subcontractor so they can make sure their code passes them.
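A minimal sketch of what such an acceptance test could look like, assuming a pytest-style setup. `calculate_shipping` and its pricing rules are invented placeholders for whatever interface the contract actually specifies, and the stub implementation stands in for the subcontractor's delivery:

```python
# Hypothetical acceptance test. The function name and pricing rules are
# placeholders for whatever behavior the contract specifies; in practice
# this file would import the subcontractor's delivered module instead of
# defining a stub.

def calculate_shipping(weight_kg: float, express: bool = False) -> float:
    """Stand-in for the delivered implementation (illustration only)."""
    base = 5.00 + 1.50 * weight_kg
    return base * 2 if express else base

def test_standard_shipping_is_base_plus_per_kilo():
    # Contracted rule: flat 5.00 plus 1.50 per kilogram.
    assert calculate_shipping(2.0) == 8.00

def test_express_shipping_doubles_the_standard_rate():
    # Contracted rule: express doubles the standard price.
    assert calculate_shipping(2.0, express=True) == 16.00
```

Handing the subcontractor a file like this makes "done" measurable: the delivery is accepted when your tests pass against their code, independent of any coverage number.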

Any time you require a percentage of coverage in your unit tests, it is up to you to provide the tool that will measure it. I don't know which environment you are running (.NET, Java, Ruby, etc.), but there is usually more than one tool available to measure coverage, and they are not all equal. You also need to specify, or at least agree on, the parameters used (i.e. coverage exclusions, type of coverage, etc.).

It would be unfair and unproductive to require testing of:

  • Generated classes/methods (some ORM tools generate classes, .Net UI components generate classes and methods, etc.)
  • System-level exception-catching code. Such code may be required by the language, and good practice, but if testing it requires hacking the platform itself, it's not worth the investment.
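One concrete way to pin those parameters down is to write them into the coverage tool's configuration and attach it to the contract. As a sketch, assuming a Python codebase measured with coverage.py (the omitted paths and the threshold below are hypothetical and would mirror whatever you agree):

```ini
# .coveragerc -- example configuration for coverage.py; paths are hypothetical
[run]
branch = True                  ; measure branch coverage, not just lines
omit =
    */generated/*              ; ORM/UI-generated classes
    */migrations/*

[report]
fail_under = 70                ; the contractually agreed minimum
exclude_lines =
    pragma: no cover           ; explicit, reviewable opt-outs
    raise NotImplementedError
```

Checking a file like this into the repository means both sides run the exact same measurement, so there is no later argument about which tool, which coverage type, or which exclusions were meant.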

Don't require more of your subcontractors than you would of your own team. If you are going to use a certain percentage of unit test coverage as an acceptance criterion, provide a range like 70-80%. If they beat it, great. I would consider 50% coverage an absolute minimum, with 70% a decent requirement. Anything above 70% may cost more, but you'll have better peace of mind about it.

Just a note about metrics like test coverage. They are just numbers, and anyone can play with numbers. I think your intent is a good one, but anyone who wants to game the system can. The coverage number is a rough indication of the thoroughness of the testing, but not the quality of the testing. In my experience, many programmers who are not used to writing unit tests tend to write integration tests, and merely run the application through the test framework without any assertions whatsoever. Essentially they are just providing themselves a launching point to step through with a debugger. It takes time and training to get unit tests that are useful.

I would require an early initial delivery simply to evaluate the effectiveness of their unit testing, and to help fine-tune both your expectations and theirs. That will help both of you get on the same page, and make future deliveries better.

OTHER TIPS

Testivus on Test Coverage -- From the Google Testing Blog:

Early one morning, a young programmer asked the great master:

“I am ready to write some unit tests. What code coverage should I aim for?”

The great master replied:

“Don’t worry about coverage, just write some good tests.”

The young programmer smiled, bowed, and left.

Later that day, a second programmer asked the same question.

The great master pointed at a pot of boiling water and said:

“How many grains of rice should I put in that pot?”

The programmer, looking puzzled, replied:

“How can I possibly tell you? It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on.”

“Exactly,” said the great master.

The second programmer smiled, bowed, and left.

Toward the end of the day, a third programmer came and asked the same question about code coverage.

“Eighty percent and no less!” replied the master in a stern voice, pounding his fist on the table.

The third programmer smiled, bowed, and left.

After this last reply, a young apprentice approached the great master:

“Great master, today I overheard you answer the same question about code coverage with three different answers. Why?”

The great master stood up from his chair:

“Come get some fresh tea with me and let’s talk about it.”

After they filled their cups with smoking hot green tea, the great master began:

“The first programmer is new and just getting started with testing. Right now he has a lot of code and no tests. He has a long way to go; focusing on code coverage at this time would be depressing and quite useless. He’s better off just getting used to writing and running some tests. He can worry about coverage later.

The second programmer, on the other hand, is quite experienced both at programming and testing. When I replied by asking her how many grains of rice I should put in a pot, I helped her realize that the amount of testing necessary depends on a number of factors, and she knows those factors better than I do – it’s her code after all. There is no single, simple answer, and she’s smart enough to handle the truth and work with that.”

“I see,” said the young apprentice, “but if there is no single simple answer, then why did you tell the third programmer ‘Eighty percent and no less’?”

The great master laughed so hard and loud that his belly, evidence that he drank more than just green tea, flopped up and down.

“The third programmer wants only simple answers – even when there are no simple answers … and then does not follow them anyway.”

The young apprentice and the grizzled great master finished drinking their tea in contemplative silence.

You can come up with a more specific measure, e.g. 100% coverage for methods with cyclomatic complexity >= 5, etc.

If you want a number, 80% is often used. In fact, where I work presently our continuous integration server (Hudson) displays a yellow light for any project on which the test coverage is below 80%.

The challenge here is that ensuring that a certain percentage of the lines are covered by tests is very different from ensuring that the code is tested in a way that leads to better maintainability.

For one thing, a code-coverage tool can be fooled by a test that exercises the code and ends with Assert.assertTrue(true).
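A Python analogue of that trick (`parse_age` is a made-up function under test): both tests below exercise the happy path and earn the same coverage for it, but only the second one can ever fail.

```python
def parse_age(text: str) -> int:
    """Made-up function under test."""
    value = int(text.strip())
    if value < 0:
        raise ValueError("age cannot be negative")
    return value

def test_gamed():
    parse_age(" 42 ")   # exercises the code, earning coverage...
    assert True         # ...but this test can never fail

def test_meaningful():
    assert parse_age(" 42 ") == 42   # verifies actual behavior
    try:
        parse_age("-1")
    except ValueError:
        pass                          # negative input must be rejected
    else:
        assert False, "expected ValueError for negative age"
```

A coverage report scores the two tests identically, which is exactly why the number measures thoroughness of execution, not quality of verification.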

The less malicious concern is that a programmer who doesn't know how to write good tests will not write them at the appropriate level of granularity, which can lead to a situation where future changes require major changes to the tests, and the tests become a burden to refactoring rather than a help.

So if I wanted to give a number, I'd use 80%. But this number is only useful when dealing with an honest developer that knows how to write good tests.

Licensed under: CC-BY-SA with attribution