Question

My business unit has a huge number of developers (about 100). For various reasons, the developers are focused almost solely on delivery rather than quality. The poor-quality code has already started hurting us. Bugs have become difficult to identify. The code has become fragile. The unit tests are quite disconnected from the requirements. And so on.

I am working closely with management. At this point I want to devise strategies to reward good code. I would like to come up with methods that objectively measure the few parameters listed below. The reason I want to measure this is to create an environment where developers feel recognized and appreciated when they write good-quality code. (I can get management support for this.) At this point management is disconnected, without numbers for measuring the quality of the code.

  1. Code readability
  2. The degree to which the unit tests correlate with requirements (not coverage; developers are able to cover the code with meaningless unit tests)
  3. The simplicity of the design

We are mostly developing in C# (mobile, web and desktop apps)

Update 1: 14th November 2019

Based on the brilliant answers, I understood that there are numerous ways to measure quality, and they all have to be taken with a pinch of salt. Telastyn's insight nailed it for me: "When a measure becomes a target, it ceases to be a good measure."

At this point, I am convinced that we have to consider the human element in code quality (especially for readability). How to get management interested in measuring code quality (subjectively and objectively) is really the challenge I am facing.


Solution

At this point I want to devise strategies to reward good code.

You cannot. Goodhart's Law will quickly come into play, and your objective metrics will become the things that your developers focus on to the exclusion of all else.

If your management is disconnected from the actual stuff you're producing, or doesn't trust the team leads who do have insight into the code, then metrics aren't going to fix that. You have a management problem, not a code problem.

OTHER TIPS

Basics

There are many ways to measure quality. None of them are perfect: if you start thinking in absolute terms, you can create new problems. Commonly used metrics include cyclomatic complexity, test coverage, and the length of classes and methods.

There are also binary measures that can be used as 'gates' such as conformance to standards or minimum test coverage.

Instead of striking out on your own, I recommend trying something such as SonarQube. With this you get a whole slew of standards and metrics that you can build into dashboards.

I'd recommend putting your code through it, doing some analysis on the results, and sharing them with your management. Then you can consider tweaking some rules and using them to break builds, assuming you have a build server and a managed delivery process. That part will require management buy-in, however.

Naming

Depending on what your ideas are about naming, this might not be doable. I'm not sure there's a way to programmatically determine whether a variable is named properly. I can think of a few heuristics:

  • length of names (too long, too short)
  • digits in names (a bad sign)
  • whether the parts are dictionary words

But ultimately, you probably need a review team to make sure the names are meaningful. Does anyone on the team agree with your concerns?
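
A minimal sketch of those heuristics, assuming invented thresholds and a toy word list (a real check would load an actual dictionary and split identifiers more carefully):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class NamingHeuristics
{
    // Invented thresholds -- tune them against your own code base.
    const int MinLength = 3;
    const int MaxLength = 40;

    // Toy stand-in for a real dictionary.
    static readonly HashSet<string> Words = new(StringComparer.OrdinalIgnoreCase)
    {
        "customer", "order", "total", "calculate", "get", "repository"
    };

    public static IEnumerable<string> Review(string identifier)
    {
        if (identifier.Length < MinLength) yield return "too short";
        if (identifier.Length > MaxLength) yield return "too long";
        if (identifier.Any(char.IsDigit)) yield return "contains digits";

        // Split camelCase/PascalCase into parts and check each one.
        var parts = Regex.Matches(identifier, "[A-Z]?[a-z]+").Cast<Match>().Select(m => m.Value);
        foreach (var part in parts.Where(p => !Words.Contains(p)))
            yield return $"'{part}' is not a dictionary word";
    }

    static void Main()
    {
        foreach (var issue in Review("CalcTotal2"))
            Console.WriteLine(issue); // contains digits, 'Calc' is not a dictionary word
    }
}
```

Heuristics like these only flag suspects; the review team still decides whether a name is actually meaningful.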

Testing

This might be a little controversial, but I would argue that unit tests don't need to be connected to the requirements. The point of unit tests is not to make sure the code does what it's supposed to do. The point is to check that the code does what you meant it to do. For example, your test for a sum function should make sure that it calculates sums correctly. It's irrelevant in the unit test whether the requirement is to calculate a sum. This is a subtle distinction but it really makes a big difference in how you write unit tests.
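
For instance, a minimal xUnit sketch (the MathUtils.Sum function and test names are hypothetical):

```csharp
using Xunit;

public static class MathUtils
{
    public static int Sum(int a, int b) => a + b;
}

public class MathUtilsTests
{
    // This verifies the code does what the author meant it to do.
    // It says nothing about whether "calculate a sum" was ever a requirement.
    [Theory]
    [InlineData(2, 3, 5)]
    [InlineData(-1, 1, 0)]
    [InlineData(0, 0, 0)]
    public void Sum_ReturnsArithmeticSum(int a, int b, int expected)
        => Assert.Equal(expected, MathUtils.Sum(a, b));
}
```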

Testing for requirements should happen in a different layer. If you are currently using unit tests for this, you should instead put together functional and regression test suites at a component or application level.

The challenge with unit-testing metrics beyond coverage is whether they are meaningful. Branch coverage is an improvement over line coverage, but neither helps if a test simply executes a path and never checks anything.
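
For illustration (all names invented), here is a test that achieves 100% branch coverage of the method under test while verifying nothing at all:

```csharp
using Xunit;

public static class Shipping
{
    public static decimal Cost(decimal weight) => weight > 10m ? 25m : 10m;
}

public class ShippingTests
{
    [Fact]
    public void Cost_ExecutesBothBranches()
    {
        // Full branch coverage, zero meaning: there are no assertions.
        Shipping.Cost(5m);
        Shipping.Cost(15m);
    }
}
```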

One idea that comes to mind is that you might be able to use mocking or fuzz-testing tools here: take the unit tests and run them against bad method definitions. Good tests should fail in that scenario.
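
This is essentially mutation testing; for C#, a tool such as Stryker.NET automates it. A hand-rolled sketch of the principle, with an invented "mutant":

```csharp
using Xunit;

public static class Pricing
{
    // Original implementation: 10% off.
    public static decimal Discount(decimal price) => price * 0.9m;

    // A deliberately bad definition (the "mutant"): same signature, wrong behavior.
    public static decimal DiscountMutant(decimal price) => price * 0.1m;
}

public class PricingTests
{
    [Fact]
    public void Discount_TakesTenPercentOff()
    {
        // A meaningful test passes against the original but would fail against
        // the mutant. A test with no assertions "passes" against both versions,
        // which is exactly the weakness this technique exposes.
        Assert.Equal(90m, Pricing.Discount(100m));
    }
}
```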

Visibility

  • It's important that the whole code base is shared and visible to the team.
  • Promote peer review.
  • Promote pair programming.
  • Use a static code analyzer; these tools are configurable and come with a preset set of rules that can be tweaked.
  • Integrate the static analyzer into the continuous-integration workflow and publish automatic semaphore reports visible to everyone.
  • Although code quality is not 100% measurable, having visible reports that measure the length of classes and methods, cyclomatic complexity, cohesion, and a whole plethora of other parameters, along with the unit test results, and then turning it all into a green, yellow, or red light (as sketched below) will give programmers an idea of whether their code is clean and adding to the overall quality of the code base rather than detracting from it.
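
A minimal sketch of such a semaphore, assuming the metrics have already been collected somewhere; the thresholds are invented for illustration:

```csharp
using System;

enum Light { Green, Yellow, Red }

record MetricsSnapshot(double AvgMethodLength, double AvgComplexity, double FailedTestRatio);

static class Semaphore
{
    // Invented thresholds; calibrate them against your own code base.
    public static Light Evaluate(MetricsSnapshot m)
    {
        if (m.FailedTestRatio > 0 || m.AvgComplexity > 15 || m.AvgMethodLength > 60)
            return Light.Red;
        if (m.AvgComplexity > 10 || m.AvgMethodLength > 30)
            return Light.Yellow;
        return Light.Green;
    }

    static void Main()
    {
        var build = new MetricsSnapshot(AvgMethodLength: 24, AvgComplexity: 8, FailedTestRatio: 0);
        Console.WriteLine(Evaluate(build)); // Green
    }
}
```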

But for me, the magic word is visibility. People in general tend to behave better when observed, and programmers are no exception. Code should be something to be proud of, something to talk about over the water cooler. Talk code, breathe code, and think code.

You also say something about unit tests not verifying that requirements are met, even with 100% coverage. I'm not sure about this: requirement or story testing is a matter for acceptance tests or integration tests, so much so that you can't necessarily infer the names and number of methods written to satisfy a given piece of functionality. Some unit tests can, though, test whether a function, given certain inputs, returns the expected result, and that can be done against a whole list of thousands of pre-computed values; code that can't pass that is broken. I think that is indeed enforcing the fulfillment of requirements. There are other requirements, though, that are the matter of integration tests, stress tests, and acceptance tests.
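
A sketch of that pre-computed-values style using xUnit's MemberData; here the pairs are generated inline, but in practice they could be thousands of values loaded from a file produced by a trusted reference implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Xunit;

public class PrecomputedValueTests
{
    // 1,000 (input, expected) pairs standing in for an external table.
    public static IEnumerable<object[]> KnownSquareRoots =>
        Enumerable.Range(1, 1000).Select(n => new object[] { (double)n * n, (double)n });

    [Theory]
    [MemberData(nameof(KnownSquareRoots))]
    public void Sqrt_MatchesPrecomputedValue(double input, double expected)
        => Assert.Equal(expected, Math.Sqrt(input), precision: 10);
}
```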

BLUF: Use whatever tools you can to measure and improve quality, but understand their limitations. In your example scenarios, it appears you need some standards for peer reviews.

Quality is by definition a fuzzy concept:

The standard of something as measured against other things of a similar kind; the degree of excellence of something. (Google definition)

While in computer science we provide quantitative numbers to qualitative concepts all the time (relevance scores, likelihood of match, etc.) those scores are rarely in a form that can be compared with each other outside of the particular set of data it was run across.

Code quality is a very complex subject, and to ignore the human factor that separates code that blindly follows "best practices" and code that is truly high quality is to ignore what you really want. There are different tools that we have to improve quality and consistency, but none of them can authoritatively say whether code is quality or not.

For example:

  • Cyclomatic complexity: a measure of how many branches the code can take; it is most useful at the function or method level. The more branches and loops there are, the more difficult the logic in that function is to follow.
  • Maintainability Index: a composite score of the cyclomatic complexity, source lines of code, and something called the Halstead volume (the formula is sketched after this list). Probably one of the better metrics, but it cannot cover the readability or understandability of the code. Treat the value like a percentage.
  • Lint or Static Analysis tools (like SonarQube): Provide a number of "best practices" codified into rules. Code that follows those rules is usually easier to follow and maintain than code that does not. Typically the rules need to be customized toward your product, but if everyone at least formats their code the same it helps readability.
  • Unit and Business Tests: help measure whether the code is fit for a particular purpose. The presupposition here is that the tests are themselves written in a way that will actually fail if conditions warrant it. (I can't tell you the number of times I have discovered a unit test that was simply meant as a way to start the debugger and had no assertions.)
  • BDD (Behavior Driven Development): creates specifications that can be instrumented to ensure the code satisfies those requirements. Typically used from the user interface itself.
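
As a sketch of the Maintainability Index mentioned above, here is the normalized formula that Visual Studio uses, where V is the Halstead volume, CC the cyclomatic complexity, and LOC the lines of code:

```csharp
using System;

static class Maintainability
{
    // MI = max(0, (171 - 5.2*ln(V) - 0.23*CC - 16.2*ln(LOC)) * 100 / 171)
    public static double Index(double halsteadVolume, int cyclomaticComplexity, int linesOfCode)
    {
        double raw = 171
                   - 5.2 * Math.Log(halsteadVolume)
                   - 0.23 * cyclomaticComplexity
                   - 16.2 * Math.Log(linesOfCode);
        return Math.Max(0, raw * 100 / 171);
    }

    static void Main()
    {
        // A short, simple method scores around 66 -- "treat the value like a percentage".
        Console.WriteLine(Index(halsteadVolume: 50, cyclomaticComplexity: 2, linesOfCode: 10));
    }
}
```

Visual Studio colors scores of 20 and above green, which fits the advice below to read these numbers like a meter rather than a grade.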

While these tools can help you improve different aspects of your code there are a number of things they simply cannot evaluate. Those things also directly impact the quality of the code:

  • Suitability of names used in the code (e.g., method, variable, and class names)
  • Evaluation of conceptual consistency (e.g., the Single Responsibility Principle)
  • Suitability of tests for correctness
  • Suitability of algorithms to solve problems

I am skeptical about the ability to quantitatively measure quality; however, the attempts to do so have provided us with good tools for measuring aspects of quality. Those tools cannot judge the whole, which is why we still need a human to look at the code and ask questions.

I recommend using quality scores like a meter. Meters only tell you if you are in operating norms, and looking at their values over time can show you trends. When they are outside of norms, you know there is something that needs to be investigated. Additionally, if the trend is toward an unfavorable state, you can take corrective actions before the code goes too far that way.

This all sounds pointless to me.

For various reasons, the developers are focused almost solely on delivery rather than quality

So this is your problem, but you would rather focus on some new rules?

Developers either do not care enough to write maintainable code, or they feel they are not given the time to finish anything beyond the point at which things sort of work for the moment; the latter is the most likely scenario. In the first case, nothing is going to help. In the second case, your macro-process is to blame, and no micro-process is going to fix it.

It all comes down to just two things:

  • capable, caring and assertive developers
  • reasonable expectations at the management level

If you have these, the rest will follow naturally. If you don't, you're toast anyway.

I am working closely with management

Good, so you can make sure that no ad hoc tasks are committed to and that enough time is allocated to new features.

tl;dr Focus on sustainability, not metrics.

You're in business, so your end goal is long-term success: making lots of money, creating products you're all proud of, improving the world in some way. A business does this reliably by creating value for customers. And in your particular case this means building quality software. But hold on! Customers care about product quality, not code quality. Code quality is neither necessary nor sufficient to create something people want. Just look at the number of professionals admitting that some popular piece of software they worked on was a giant ball of mud, or the rare piece of crafted software which gets nowhere because it doesn't solve the right problem. The crafted non-solutions never get to the market in the first place, and the big balls of mud often get trounced when leaner companies overtake them. (Although some big balls of mud make a lot of money by being first, so the company can pay down that technical debt to stay lean enough.)

If anyone said they could sum up your achievements in a single number, you would laugh at them. Nobody is "80% successful at life." Why would quality metrics be more useful? Any single number (and a weighted sum of several metrics is also just a single number) is not going to tell you how well the software will do. Don't get me wrong, quality assurance is one of the most important ways to make sure you are actually able to produce workable software at all, but nothing so simple could possibly be a silver bullet in our complex world.

Want employees focused on their salary, who will never be invested in the success of the company? Give them metrics to blindly optimize for. Programmers are good at that. You want employees focused on the long-term success of the company? Give them a share in the company and enable them to produce useful software.

Licensed under: CC-BY-SA with attribution