Question

Is the sum of the cyclomatic complexities of all sections in a file the total cyclomatic complexity of that file? If so, is the sum over a set of related files the total cyclomatic complexity of that set?

This confuses me because most of the examples of cyclomatic complexity I've seen use fragments of code, so I've never run into an example in which the cyclomatic complexity of a whole file is calculated.

If my assumption is wrong, what is the proper way to aggregate the cyclomatic complexity of a set of source code fragments?

I was pointed to the question "What does the 'cyclomatic complexity' of my code mean?" as a possible answer, but neither the question nor its answers discuss how to aggregate the cyclomatic complexity of separate code fragments.

Solution

Your assumption that the sum of CCs is the aggregate CC is correct, but perhaps not very useful.

Cyclomatic complexity is based on the control flow graph. Usually, we only look at the control flow graph of a single function. We can also look at the control flow graph of an entire program, as if all functions had been inlined into main(). Looking at the entire program is not really useful: the CC will be unreasonably large and will convey little information.

McCabe developed cyclomatic complexity in the context of unstructured programs: built around gotos or jumps rather than if/else, loops, and functions. Part of his motivation was to show that the correct use of structured and unstructured features can simplify a program. He also defines the term “structured program”. Essentially, he suggests: if a code snippet can be extracted into a function, we can count its complexity as 1, no matter what the internal complexity of this function is. As a consequence, any “structured program” that starts with a main() function would have a whole-program cyclomatic complexity of 1.

As an aside, I'd like to point out that the use of exceptions technically violates this structuredness property, since a function that can throw exceptions has one entry point but two exit points: one for normal return, and one for throwing. Another interesting question is how to count polymorphic calls, i.e. OOP method calls. A polymorphic call site doesn't jump to a particular function, but dynamically selects a function at run time. Pedantically, such a call might have infinite cyclomatic complexity.

So does it make sense to speak of the combined cyclomatic complexity of a bunch of functions? Not really, at least in practice. These functions are independent. Even when they share control flow graphs because they call each other, we can treat them as structured programs and ignore the called control flow graph. What matters for maintainability is every function by itself.
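To illustrate looking at every function by itself, here is a minimal sketch that reports an approximate cyclomatic complexity per function for Python source. It assumes the common shortcut "1 + number of decision points" and an arbitrary choice of which syntax nodes count as decisions; real tools differ on those details. The names `approximate_cc`, `per_function_cc`, and the sample functions are hypothetical and exist only for this example.

```python
import ast

# Node types treated as decision points in this sketch. This list is an
# assumption; real tools make different choices about what to count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_cc(func_node):
    """Approximate CC of one function as 1 + its number of decision points."""
    complexity = 1
    for node in ast.walk(func_node):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, so two extra branches.
            complexity += len(node.values) - 1
    return complexity


def per_function_cc(source):
    """Map each function name in `source` to its approximate CC.

    Limitation: a nested function is reported on its own and is also
    counted as part of the function that contains it.
    """
    tree = ast.parse(source)
    return {
        node.name: approximate_cc(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

def first_positive_even(items):
    for item in items:
        if item > 0 and item % 2 == 0:
            return item
    return None
'''

report = per_function_cc(SAMPLE)
print(report)  # {'clamp': 3, 'first_positive_even': 4}
```

A per-function report like this is what most maintainability thresholds are applied to; the file as a whole has no entry in it.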

However, it is fundamentally correct to simply add the cyclomatic complexities of all functions. Ignoring calls, each function has an independent entry point and exit point: a combined control flow graph exists but is disconnected. The sum of the individual cyclomatic complexities then accurately describes the number of independent paths through the code. Arguably, it would be correct to only include public functions that can be called from the outside, but not private functions.
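The additivity can be checked with the usual graph formula M = E − N + 2P (edges, nodes, connected components). The sketch below is only an illustration: the two tiny control flow graphs are hand-written, hypothetical examples (one if/else, one while loop), and `cyclomatic_complexity` is a helper made up for this answer, not part of any tool.

```python
def cyclomatic_complexity(nodes, edges):
    """Return M = E - N + 2P for a control flow graph.

    P is the number of connected components, found with a small union-find
    that treats edges as undirected.
    """
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + 2 * components


# Hypothetical function f: a single if/else.
f_nodes = ["f.entry", "f.then", "f.else", "f.exit"]
f_edges = [
    ("f.entry", "f.then"),
    ("f.entry", "f.else"),
    ("f.then", "f.exit"),
    ("f.else", "f.exit"),
]

# Hypothetical function g: a single while loop.
g_nodes = ["g.cond", "g.body", "g.exit"]
g_edges = [
    ("g.cond", "g.body"),
    ("g.body", "g.cond"),
    ("g.cond", "g.exit"),
]

cc_f = cyclomatic_complexity(f_nodes, f_edges)  # 4 - 4 + 2*1 = 2
cc_g = cyclomatic_complexity(g_nodes, g_edges)  # 3 - 3 + 2*1 = 2

# Ignoring calls, the combined graph is just both graphs side by side,
# so it has two connected components.
cc_combined = cyclomatic_complexity(f_nodes + g_nodes, f_edges + g_edges)

print(cc_f, cc_g, cc_combined)  # 2 2 4
```

Because E, N, and P each add up across disconnected components, the combined value always equals the sum of the per-function values.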

OTHER TIPS

Cyclomatic complexity is a property of one unit of code, used as a measure of maintainability and of how hard it is to understand what that unit does. Applying it to a file is meaningless because having a lot of methods in a class, or a lot of classes in a file, does not necessarily affect complexity. More methods, more classes, or more files add no logic component.

Licensed under: CC-BY-SA with attribution