How to deal with static analyzer output

https://stackoverflow.com/questions/2070397

20-09-2019

Question

We have started using a static analyzer (Coverity) on our code base. We were promptly stupefied by the sheer number of warnings we received (it's in the hundreds of thousands); it would take the entire team a few months to clear them all (obviously impossible).

The options we have discussed so far are:

1) Hire a contractor to sort out the warnings and fix them. The drawback: we will probably need very experienced people to make all these modifications, and no contractor will have the required understanding of the code.

2) Filter the warnings and deal only with the dangerous ones. The problem here is that our static analysis output will always be cluttered with warnings, making it difficult for us to isolate problems. Filtering the warnings is also a major effort.

Either way, bringing our code to a state where the static analyzer can be a useful tool for us seems a monumental task.

So how is it possible to work with the static analyzer without bringing current development efforts to a complete standstill?

Solution

The first thing to do is tweak the heck out of your analysis settings; Coverity support probably left you with a fairly generic configuration.

Triage a representative sample of the defects, and if a checker doesn’t seem to be producing a lot more signal than noise, turn it off for now. (Most of Coverity’s checkers are good, but nobody’s perfect, and it sounds like you need to do some ruthless prioritization.)
- In the long run, turn some of those checkers back on, but mark them in your reporting as low priority. (This is harder than it should be; I’ve long argued that Coverity needs to read a couple of papers on defect ranking by somebody called Dawson Engler. :-)
- In the even longer run, try the checkers that are disabled by default; some of them find impressive bugs. And parse warnings are surprisingly useful, though you do need to turn off some bogus ones.
Be cynically realistic about which part of your codebase you’re actually going to fix soon. Use components to skip analysis on the code you’re not going to fix defects in, at least for now. (For instance, in theory, if your product includes third-party code, you’re responsible for its quality and should patch bugs in it. In practice, such bugs rarely get fixed. And if it’s mature third-party code, the false positive rate will be high.)
- Setting up components and exclusion is tricky, but once it’s done, they work well—one of my negative look-ahead regexes had over a hundred disjuncts.
- Components also help with assigning individual responsibility for defects, which I’ve found to be crucial to getting them fixed.
Set up a report for only new defects, and have people watch that URL. New defects are in active code, and it’s easier to get started with a No New Warnings policy.
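The triage step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample data, the verdict labels, and the 50% cut-off are all made up, and the checker names are used purely as examples. The idea is to hand-review a sample, compute per-checker precision, and switch off the checkers that fall below whatever threshold your team picks.

```python
from collections import defaultdict

# Hypothetical hand-triaged sample: (checker name, reviewer verdict).
# The entries are illustrative, not real analysis output.
SAMPLE = [
    ("NULL_RETURNS", "bug"),
    ("NULL_RETURNS", "bug"),
    ("NULL_RETURNS", "false_positive"),
    ("DEADCODE", "false_positive"),
    ("DEADCODE", "intentional"),
    ("DEADCODE", "false_positive"),
    ("RESOURCE_LEAK", "bug"),
]


def checker_precision(sample):
    """Return {checker: fraction of sampled defects judged to be real bugs}."""
    total = defaultdict(int)
    real = defaultdict(int)
    for checker, verdict in sample:
        total[checker] += 1
        if verdict == "bug":
            real[checker] += 1
    return {checker: real[checker] / total[checker] for checker in total}


def checkers_to_disable(sample, min_precision=0.5):
    """Checkers whose sampled precision is below the (assumed) threshold."""
    precision = checker_precision(sample)
    return sorted(c for c, p in precision.items() if p < min_precision)


if __name__ == "__main__":
    for checker, p in sorted(checker_precision(SAMPLE).items()):
        print(f"{checker}: {p:.0%} of sampled defects were real")
    print("disable for now:", checkers_to_disable(SAMPLE))
```

A sample this small says little about a checker; in practice you would want enough triaged defects per checker for the ratio to be meaningful.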

Let me end with a couple of disclaimers:

You may want to re-ask this question in the Coverity support forum (http://forums.coverity.com/), which isn’t very active, but where we don’t have to worry about violating the NDA. I’ve got a list there of the checkers I found worth enabling.
I do this for a living, and maybe you want to hire us (http://codeintegritysolutions.com/); I’m giving a talk on this subject at Stanford today. Hiring a consultant to do the tuning makes a lot of sense; having somebody outside the company doing the triaging is trickier. Having an outsider do the fixes is trickier still; learning from your mistakes is even more important than fixing them.
- I’ve expanded this a bit with some parts of my Stanford talk, for our corporate blog: http://codeintegrity.blogspot.com/2010/01/handling-embarrassment-of-riches.html.

OTHER TIPS

One day a week: Turn on analysis; pick the 100 most annoying warnings; fix them; turn analysis off. In short: don't panic; your code works as it is (doesn't it?); work through the warnings in bite-sized chunks.

If you find that the same types of warnings keep reappearing (bad coding practices), educate your team to avoid them in future.

I did this with an old code base: I'd get in early in the morning (before the rest of the team), crank up the warning/analysis level on the compiler, fix some warnings and then set it back to the defaults.

For legacy code: prioritize these legacy bugs and come up with a plan to deal with them. Balance bug fixing against new feature development; new features are always more important.
For new code: make it part of your integration process. Before checking in new code, make sure it is free of Coverity defects.

For your contractor option, you might go a more moderate route and have them fix only the issues that are clear, local, and don't need a full understanding of your code. I'd guess that a high number of the Coverity hits are things like possible NULL pointer dereferences or possible writes past the end of a buffer that can be fixed with simple checks that are completely local to the code in question and need no understanding of the big picture.

I'll admit - I've done work like this before using the preFAST/preFIX or whatever the tool is called from Microsoft, and a lot of it was mechanical changes. Well suited to a contractor or maybe even an intern. But there will be stuff that needs more analysis - just make sure it's clear to the contractor(s) that they shouldn't try to get too deep into things.

And be nice to them - it's drudge work, so make whatever else you can pleasant.

The Coverity people told us to 'ignore' all warnings the first time we used it. The next differential build then reports only the incrementally new warnings, which you should fix. After you have used the tool for a few months and are comfortable with it, go back and start fixing the old warnings.
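If your tool can export defects as records, the "baseline first, then fix only what is new" approach might look like the sketch below. Everything here is an assumption for illustration: the record fields (`checker`, `file`, `function`) and the sample data are hypothetical, and Coverity's own differential reporting does not necessarily work this way.

```python
def defect_key(defect):
    """Identity that survives line-number drift: checker, file, and function."""
    return (defect["checker"], defect["file"], defect["function"])


def new_defects(baseline, current):
    """Return defects in `current` whose key does not appear in `baseline`."""
    known = {defect_key(d) for d in baseline}
    return [d for d in current if defect_key(d) not in known]


# Hypothetical exports from the first run (the baseline) and a later build.
BASELINE = [
    {"checker": "NULL_RETURNS", "file": "src/parse.c", "function": "parse", "line": 40},
    {"checker": "RESOURCE_LEAK", "file": "src/io.c", "function": "load", "line": 88},
]
CURRENT = [
    # Same defect as in the baseline; only the line number moved.
    {"checker": "NULL_RETURNS", "file": "src/parse.c", "function": "parse", "line": 43},
    {"checker": "RESOURCE_LEAK", "file": "src/io.c", "function": "load", "line": 88},
    # Introduced after the baseline was taken.
    {"checker": "OVERRUN", "file": "src/net.c", "function": "recv_msg", "line": 12},
]

if __name__ == "__main__":
    for d in new_defects(BASELINE, CURRENT):
        print(f'{d["file"]}:{d["line"]} {d["checker"]} in {d["function"]}')
```

Leaving the line number out of the key keeps old defects from resurfacing as "new" whenever unrelated edits shift the code. The trade-off is that a second defect from the same checker in the same function is hidden behind the first.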

Try other static analyzers. I used to work with Parasoft C++ Test and it had a convenient way to filter warnings according to their severities.

Does it mean you have problems customizing it to your needs?
As the Coverity product page puts it:

"Customizable Analysis

Coverity Static Analysis provides the ability to fine tune analyses by modifying either the number of checkers deployed, or the settings specific to an individual checker, such as the threshold for null pointer dereferences. The ability to configure Coverity for a particular code block, or application, allows developers to select the level of performance most appropriate for their application, and leads to more accurate and reliable results. The Coverity Software Development Kit allows you to detect unique defect types in C and C++ code by creating custom checkers. This is in addition to creating custom checkers for finding concurrency, exception handling, and other critical issues."
http://www.coverity.com/products/static-analysis.html

Typical out-of-the-box analysis configuration for Coverity will tend to give between one and three defects per thousand lines of code. If you have hundreds of thousands of defects, and you have significantly less than 100 million lines of code, I can guarantee that your analysis configuration is incorrect or suboptimal for your organization.
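The arithmetic behind that claim is easy to check. The sketch below takes 100,000 defects as an assumed lower bound for "hundreds of thousands" and computes how many lines of code that would imply at one and at three defects per thousand lines.

```python
def implied_lines_of_code(defect_count, defects_per_kloc):
    """Lines of code needed to produce `defect_count` at the given density."""
    return defect_count / defects_per_kloc * 1000


if __name__ == "__main__":
    defects = 100_000  # assumed lower bound for "hundreds of thousands"
    for density in (1, 3):
        lines = implied_lines_of_code(defects, density)
        print(f"{density} defect(s) per KLOC -> about {lines / 1e6:.0f} million lines")
```

At one defect per thousand lines, 100,000 defects already corresponds to 100 million lines, so a much smaller code base with that many reports points to the configuration rather than the code.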

In addition to tuning analysis configuration itself, you can prioritize by impact - the default "high", "medium" and "low" mapping should be good enough to get you started till you get a feel for which subcategories tend to be most damaging to your application.

Third, if your codebase is large (and whose isn't?), subdivide it into components so that each team or group of developers looks only at the code they directly maintain. This gives you more manageable chunks of work, and it also lets you prioritize by component (defects in server code are more critical than defects in client code, test code, third-party code, etc.).
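As a rough sketch of prioritizing by component and then by impact, the Python below maps file paths to components with regular expressions, skips excluded directories with a negative look-ahead, and sorts what is left. The directory names, component ranking, and defect records are all hypothetical; this is not Coverity's component-map syntax, only an illustration of the idea.

```python
import re

# Hypothetical layout: paths are relative to the repository root.
EXCLUDED_DIRS = ["third_party", "vendor", "tests"]

# Negative look-ahead: matches only paths that do NOT start with an excluded directory.
IN_SCOPE = re.compile(r"^(?!(?:%s)/)" % "|".join(map(re.escape, EXCLUDED_DIRS)))

# First matching pattern wins; anything unmatched lands in "other".
COMPONENT_PATTERNS = [
    ("server", re.compile(r"^server/")),
    ("client", re.compile(r"^client/")),
]
COMPONENT_RANK = {"server": 0, "client": 1, "other": 2}  # assumed priorities
IMPACT_RANK = {"high": 0, "medium": 1, "low": 2}


def component_of(path):
    """Map a file path to a component name."""
    for name, pattern in COMPONENT_PATTERNS:
        if pattern.search(path):
            return name
    return "other"


def work_queue(defects):
    """Drop out-of-scope defects, then sort by component rank and impact."""
    in_scope = [d for d in defects if IN_SCOPE.search(d["file"])]
    return sorted(
        in_scope,
        key=lambda d: (COMPONENT_RANK[component_of(d["file"])], IMPACT_RANK[d["impact"]]),
    )


# Hypothetical defect records.
DEFECTS = [
    {"file": "client/ui.c", "impact": "high"},
    {"file": "server/net.c", "impact": "low"},
    {"file": "third_party/zlib/inflate.c", "impact": "high"},
    {"file": "server/auth.c", "impact": "high"},
    {"file": "tools/gen.c", "impact": "medium"},
]

if __name__ == "__main__":
    for d in work_queue(DEFECTS):
        print(component_of(d["file"]), d["impact"], d["file"])
```

Ranking by component first means a low-impact server defect is listed ahead of a high-impact client one. Swap the two elements of the sort key if impact should dominate.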

Licensed under: CC-BY-SA with attribution
