Monday, March 31, 2008

Refactoring and Cyclomatic Complexity

"Refactoring" by Martin Fowler is a great book that gives formal definition to a huge mishmash of informal programming procedures and habits. It not only argues the benefits and "whys" of the whole refactoring game (which are often informally understood and championed by many), but also gives formal definition to the types of code decay most often occurring as well as formal definitions and names to the possible solutions.

Fowler lovingly refers to this code decay as "code pungency" or "bad smells in code" when encountered by a developer. However in 1976 Thomas McCabe coined the term "Cyclomatic Complexity" in a paper outlining a software metric used to analyze the complexity of programs. The metric measures the number of linearly independent paths through a given chunk of source code.

"Cyclomatic complexity is computed using a graph that describes the control flow of the program. The nodes of the graph correspond to the commands of a program. A directed edge connects two nodes if the second command might be executed immediately after the first command." [wikipedia]

Basically as the number of conditional statements in a program grows, so do the number of corresponding paths and therefore the graphical representation of the program's control flow. A program with no if/else statements has a single path. Add in one 'if' statement and the number of paths grows to two. One path if the 'if' returns false, and another if it returns true. Any function call including some sort of boolean assessment will fork the program's path, not to mention 'while' and 'switch/case' statements, ad infinitum.

The software metric of cyclomatic complexity serves as a baseline indicator for when a program needs refactoring. From a top level viewpoint the analysis is useless because all programs must include some degree of complexity in order to be useful. Where the measurement become useful is at the method level where methods including some 30 or more linearly independent paths are clearly marked as overly complex and should be scheduled for refactoring.

Several tools exist which do good jobs of measuring this metric:
[JavaNCSS]
[Dependency Finder]

One which includes an excellent visual output of the state of a project's complexity is Panopticode, which also provides visual output of your given level of code coverage as a bonus.

Example: The complexity state of the Cruise Control 2.6 project (hit refresh until it loads).

All in all refactoring is an extremely necessary concept for every programmer to understand, even those who practice it now stand to gain a great deal by understanding why they are doing what they do. I hope to find other refactoring tools to aid in the developmental process soon.

No comments: