Question

Most defect density metrics look like this:

Number of bugs / Size (KLOC, story points, function points, ...)
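
For concreteness, a minimal sketch of the computation (all numbers here are made up):

```python
# Hypothetical example: defect density as bugs per KLOC.
bugs_found = 12   # defects reported against a component
size_kloc = 4.8   # thousands of lines of code in that component

defect_density = bugs_found / size_kloc
print(f"{defect_density:.2f} defects per KLOC")  # -> 2.50 defects per KLOC
```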

I have read that it measures the effectiveness of QA, but I do not get it:

  • I can have a super-senior developer and a sloppy QA engineer. The sloppy QA engineer finds nothing because there is nothing to find.
  • I can have a bad developer and a good QA engineer who finds most of the issues.

Therefore, how could this be measuring QA? In my view, it reflects both the quality of development and the effectiveness of QA.


Solution

I have read that it measures the effectiveness of QA but I do not get it

It is surely a good idea to take any such statement about metrics with a grain of salt.

The only way I can think of to use the given metric to measure the effectiveness of QA is to take the same piece of code, give it to different QA people, and let them test it independently. In theory, the more bugs they find, the better the QA (though in reality, the severity of the bugs found should also be taken into account).

Of course, by doing so the size of the code is fixed, so using the density as a number is quite pointless; you could actually just use "Number of bugs found by QA" as a metric.
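
A minimal sketch of that idea (tester names and counts are invented): because the code under test is identical, dividing by its size rescales every tester's count by the same constant, so ranking by density and ranking by raw count give the same order.

```python
# Hypothetical comparison: the same piece of code tested independently
# by several testers.
SIZE_KLOC = 10.0  # fixed size of the shared piece of code

bugs_found_by = {"tester_a": 7, "tester_b": 12, "tester_c": 4}

for tester, bugs in sorted(bugs_found_by.items(), key=lambda kv: -kv[1]):
    # Density differs from the raw count only by the constant factor 1/SIZE_KLOC.
    print(f"{tester}: {bugs} bugs ({bugs / SIZE_KLOC:.1f} per KLOC)")
```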

As you correctly noted yourself, as soon as you vary the QA people and the subject under test, the metric no longer tells you anything reliable about the effectiveness of QA.

Since you did not give any reference for where you found this questionable statement, I did a quick web search to see what others have written about this metric. It brought me to this page, which mentions two uses for the defect density metric:

  • For comparing the relative number of defects in various software components so that high-risk components can be identified and resources focused towards them.

  • For comparing software/products so that quality of each software/product can be quantified and resources focused towards those with low quality.
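
The first of these uses could look like the following sketch (component names and numbers are invented for illustration):

```python
# Hypothetical sketch: rank components by defect density to spot high-risk ones.
components = [
    # (name, defects reported, size in KLOC)
    ("billing",   42, 12.0),
    ("auth",       9,  3.0),
    ("reporting", 15, 20.0),
]

for name, defects, kloc in sorted(components, key=lambda c: c[1] / c[2], reverse=True):
    print(f"{name}: {defects / kloc:.2f} defects/KLOC")
# billing (3.50) and auth (3.00) stand out over reporting (0.75),
# suggesting where review and testing resources should be focused.
```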

"Effectiveness of QA" is nowhere mentioned. So according to this source, defect density is a metrics for quantifying quality aspects of the software, not of the development or QA process.

(And if you read something different in a textbook, you had better ask the authors of the book what they had in mind with their statement.)

Other Tips

Defect density measures the number of defects found per unit of code measurement (never ever use lines of code for this), for a given period of time. The higher the number, the greater the density of bugs.

The density of bugs can be taken as an inverse measure of the quality of the app. The lower the density (i.e., the fewer defects reported), the higher the quality of the code.
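
As a rough sketch of the "per period of time" aspect (release labels and numbers are invented, and it uses function points rather than lines of code), one would typically watch the trend per release rather than a single absolute number:

```python
# Hypothetical trend: defect density per release, normalized by function points.
releases = [
    # (release, defects found in the period, size in function points)
    ("1.0", 30, 400),
    ("1.1", 18, 450),
    ("1.2", 10, 470),
]

for release, defects, fp in releases:
    print(f"v{release}: {defects / fp * 100:.1f} defects per 100 FP")
# A falling density *may* indicate improving quality -- or weaker testing,
# which is exactly the ambiguity described in the flaws below.
```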

But being a simple metric, it has flaws:

  • If I simply do not test the code, my defect density is zero: perfect quality! Of course, in reality most untested code is far from good quality.
  • Conversely, for the same piece of code, I would get way more defects reported by super-zealous testers than I would for a lacklustre dev asked to do some testing. So I now have two densities (and thus two quality measures) for the one piece of code. And since I'm judging those testers on the number of defects raised, I'm incentivising them to score the quality as low as possible.
  • And of course, there's the issue of early testing. Pair a developer and tester during a sprint, and the output will likely be low in bugs. But it's extremely unlikely that the defects they found during the sprint will be recorded, so I've no measure of quality unless I retest just to gain my metric.

Defect density is just a metric. To read more into it (quality of code, effectiveness of testing, likelihood of the app containing significant bugs, etc.) requires a heavy dose of subjectivity. Unless you know how effective your testing is, for example, defect density won't be a reliable quality measure. But if you can't use metrics to measure the effectiveness of testing, how do you measure it? So a subjective measure is needed.

It was probably referring to defects found after release, i.e., by users, not QA.

In this case, it is a measure of QA effectiveness, as poorly coded work would be easier to find bugs in and hence would not be released. Very good, but not perfect, code might have a couple of minor, hard-to-find bugs slip through, but not many.
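
Under that reading, the number being computed is closer to a defect escape rate than a plain density; a hypothetical sketch (all numbers invented):

```python
# Hypothetical sketch: defects that escaped QA and were found by users
# after release, as a share of all defects found.
found_by_qa_before_release = 48
found_by_users_after_release = 6

total = found_by_qa_before_release + found_by_users_after_release
escape_rate = found_by_users_after_release / total
print(f"Escape rate: {escape_rate:.1%}")  # -> Escape rate: 11.1%
```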

Of course, it could also measure project-manager-caring-more-about-deadlines-than-bugs-ness.

Licensed under: CC-BY-SA with attribution