I was sitting in a project management meeting where we were discussing the various software metrics we measure and their trends in our project. So I decided to jot down some of my opinions on the topic.
Are Software Metrics Required?
This is the first question that comes to a developer's mind, and the answer is “yes”. Developing software is similar to manufacturing a product: to improve the quality of the software, we need to measure various aspects of it. The industry defines standard benchmarks for each metric, so we can see where our software stands and what needs to be done if those benchmarks are not being met.
Knowing What to Measure
Choosing the right metrics to measure can make the difference between the success and failure of a project. Deciding on the right metrics depends on many aspects of the software, such as its size, complexity, mission criticality, and maintainability. Many IT companies have a quality department that identifies what needs to be measured over the course of a project, and we can prescribe additional metrics based on our requirements. Typically, it's a three-step process, as listed below:
- Set the goal: This involves identifying which aspect of the software we are trying to improve. An example goal would be “Identify fault-prone modules early in our project”.
- Ask the right questions: This involves asking ourselves what is preventing us from achieving the goal. Typical questions for the above goal would be “Is complexity impacting the module?”, “How much testing has been done on the module?”, “How many code changes are happening over time?”, etc.
- Identify the Metrics: For most of the questions asked above, there are standard metrics already defined that we can use; a small sketch of such a goal-question-metric mapping follows this list.
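To make the mapping concrete, here is a minimal sketch of the goal, questions, and candidate metrics as a simple data structure. The particular metric names attached to each question are illustrative assumptions, not a prescribed standard.

```python
# A minimal, illustrative goal-question-metric mapping.
# The metrics attached to each question are examples, not a standard.
gqm = {
    "goal": "Identify fault-prone modules early in the project",
    "questions": {
        "Is complexity impacting the module?": ["cyclomatic complexity", "fan-in/fan-out"],
        "How much testing has been done on the module?": ["test coverage", "defect density"],
        "How many code changes are happening over time?": ["code churn"],
    },
}

# Walk the mapping and print which metrics answer each question.
for question, metrics in gqm["questions"].items():
    print(f"{question} -> measure: {', '.join(metrics)}")
```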
Commonly Measured Software Metrics
The sections below briefly explain some of the commonly used software metrics.
Cyclomatic Complexity
This metric indicates the complexity of a program. It is computed from the control flow graph of the program: the cyclomatic complexity is M = E - N + 2P, where E is the number of edges, N is the number of nodes, and P is the number of connected components of the graph; for a single function, this works out to the number of decision points plus one. It was invented by Thomas J. McCabe, Sr. in 1976. The commonly used risk classification based on cyclomatic complexity is:
- 1-10: a simple program, low risk
- 11-20: more complex, moderate risk
- 21-50: complex, high risk
- greater than 50: untestable, very high risk
As you can deduce from the classification above, even a small application taken as a whole will have a very high cyclomatic complexity, so the metric is measured at the function level. The common industry standard is that no function in your application should have a cyclomatic complexity greater than 10. The advantage is that such a function needs at most 10 test cases to cover its basis paths, and research suggests that most programmers can easily read and modify functions with a cyclomatic complexity of 10 or less, because the cognitive load is lower. This benchmark can be set higher if team members are experienced and have been working on the same code-base for a long time, as familiarity with the code-base reduces cognitive load.
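As a quick illustration, the sketch below counts the decision points of a toy function by hand; this is a worked example, not an analyzer (tools normally compute the number from the control flow graph).

```python
def classify_order(total, is_member, coupon):
    # Decision points: "if", "elif", and the short-circuit "and" below.
    if total <= 0:              # decision 1
        return "invalid"
    elif is_member and coupon:  # decisions 2 and 3 ("elif" + "and")
        return "discounted"
    return "regular"

# Cyclomatic complexity = decision points + 1 = 3 + 1 = 4.
# Against the 1-10 threshold above, this function is low risk and
# needs at most 4 test cases to cover its basis paths.
print(classify_order(100, True, "SAVE10"))  # discounted
```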
Fan-In Fan-Out
These are structural metrics that measure inter-module complexity. Fan-in is the number of modules that call a given module; fan-out is the number of modules that are called by a given module.
These metrics can be applied both at module level and at function level. They simply put a number on how complex the interlinking of different modules or functions is. Unlike cyclomatic complexity, you cannot set a hard limit and say the value must not go beyond it. The metrics are used to size up how difficult it would be to replace a function or module in your application, and how changes to one function or module can impact others. Sometimes a restriction is put on the fan-out of a function to avoid cluttering it, but this is not a widely accepted practice.
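As a rough sketch, fan-in and fan-out can be derived from a call graph; the module names and call edges below are invented purely for illustration.

```python
# Hypothetical call graph: module -> modules it calls.
calls = {
    "ui":      ["orders", "auth"],
    "orders":  ["db", "auth"],
    "reports": ["db"],
    "auth":    [],
    "db":      [],
}

# Fan-out: how many modules a module calls.
fan_out = {module: len(callees) for module, callees in calls.items()}

# Fan-in: how many modules call a given module.
fan_in = {module: 0 for module in calls}
for callees in calls.values():
    for callee in callees:
        fan_in[callee] += 1

for module in calls:
    print(f"{module}: fan-in={fan_in[module]}, fan-out={fan_out[module]}")
```

A module like db with a high fan-in is hard to replace because many modules depend on it, while a module with a high fan-out is sensitive to changes in many other modules.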
Cohesion
Cohesion refers to the degree to which the elements of a module belong together. It is expressed as “high cohesion” or “low cohesion”. High cohesion is preferred, as it increases the robustness, reliability, re-usability, and understandability of a module. Cohesion largely determines how well your application code is organized, allowing developers to change code confidently; a short sketch contrasting the worst and best types follows the list below.
Types of Cohesion
- Coincidental cohesion (worst): Is when parts of a module are grouped arbitrarily; the only relationship between the parts is that they have been grouped together (e.g. a “Utilities” class).
- Procedural cohesion: Is when parts of a module are grouped because they always follow a certain sequence of execution (e.g. a function which checks file permissions and then opens the file).
- Communicational cohesion: Is when parts of a module are grouped because they operate on the same data (e.g. a module which operates on the same record of information).
- Sequential cohesion: Is when parts of a module are grouped because the output from one part is the input to another part, like an assembly line (e.g. a function which reads data from a file and then processes that data).
- Functional cohesion (best): Is when parts of a module are grouped because they all contribute to a single well-defined task of the module (e.g. tokenizing a string of XML).
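The sketch below contrasts a coincidentally cohesive grab-bag class with a functionally cohesive one; the class and method names are made up for illustration.

```python
# Coincidental cohesion (worst): the only thing these methods share
# is that someone dropped them into the same "Utilities" class.
class Utilities:
    def parse_date(self, text): ...
    def send_email(self, address, body): ...
    def compress_image(self, data): ...


# Functional cohesion (best): every method contributes to the single,
# well-defined task of tokenizing an XML string.
class XmlTokenizer:
    def __init__(self, xml_text):
        self.xml_text = xml_text

    def tokens(self):
        """Yield (kind, value) pairs such as ("tag", "note") or ("text", "hello")."""
        ...

    def _read_tag(self, position): ...

    def _read_text(self, position): ...
```

Changing how dates are parsed touches Utilities only by accident of packaging; changing how XML is tokenized touches exactly one class, which is the point of high cohesion.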
Coupling
In software engineering, coupling or dependency is the degree to which each program module relies on the other modules. The common types, listed from tightest to loosest, are below; a short sketch contrasting stamp and data coupling follows the list.
- Content coupling (highest): One module relies on or modifies the internal workings of another module (e.g. accessing local data of another module).
- Common coupling: Two or more modules share the same global data (e.g. a global variable).
- External coupling: External coupling occurs when two modules share an externally imposed data format, communication protocol, or device interface.
- Control coupling: Control coupling is one module controlling the flow of another, by passing it information.
- Stamp coupling (Data-structured coupling): Stamp coupling is when modules share a composite data structure and use only a part of it, possibly a different part (e.g., passing a whole record to a function that only needs one field of it).
- Data coupling: Data coupling is when modules share data through parameters (e.g., passing an integer to a function that computes a square root).
- Message coupling (low): This is the loosest type of coupling. It can be achieved by state decentralization (as in objects) and component communication is done via parameters or message passing.
- No coupling: Modules do not communicate at all with one another.
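To make two of the middle levels concrete, the sketch below contrasts stamp coupling with data coupling; the Employee record and the functions are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    department: str
    monthly_salary: float


# Stamp coupling: the function receives the whole composite record
# but only needs one field of it.
def annual_salary_stamp(employee: Employee) -> float:
    return employee.monthly_salary * 12


# Data coupling: the function receives only the elementary data it needs.
def annual_salary_data(monthly_salary: float) -> float:
    return monthly_salary * 12


employee = Employee("Asha", "QA", 5000.0)
print(annual_salary_stamp(employee))                # 60000.0
print(annual_salary_data(employee.monthly_salary))  # 60000.0
```

The data-coupled version is easier to test and to reuse, because callers do not need to build a whole Employee just to compute a salary.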
Code Churn
Code churn is the total number of lines of code added, modified, and deleted over a period of time. It records the change history of the software and can indicate how large recent changes were, how many consecutive edits files have gone through, and which source files have seen the largest changes.
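As a rough sketch, churn can be pulled out of version-control history; the example below sums added and deleted lines per file from git log --numstat over the last 30 days (the time window and the number of files printed are arbitrary choices).

```python
import subprocess
from collections import defaultdict

# Per-file added/deleted line counts for the last 30 days.
output = subprocess.run(
    ["git", "log", "--numstat", "--since=30 days ago", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

churn = defaultdict(int)
for line in output.splitlines():
    parts = line.split("\t")
    if len(parts) != 3:
        continue  # skip blank separator lines
    added, deleted, path = parts
    if added == "-" or deleted == "-":
        continue  # binary files report "-" instead of line counts
    churn[path] += int(added) + int(deleted)

# Print the ten highest-churn files.
for path, lines_changed in sorted(churn.items(), key=lambda item: -item[1])[:10]:
    print(f"{lines_changed:6d}  {path}")
```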
Advantages of Measuring Metrics
- Allows architects, project managers, and stakeholders to control the software development process and its quality. For example, high code churn at the end of the development cycle, even though the change requests or bug fixes are simple, can be an indication of poor design.
- Allows developers to customize best practices for their project. For example, an experienced team can decide to allow a cyclomatic complexity higher than the recommended 10 per function, to avoid having to create too many small functions.
- By analyzing trends, the team can see how the changes they are making impact overall software quality.
Disadvantages of Measuring Metrics
- How metrics are interpreted can have a great impact on software development; the same metric can be interpreted in different ways by different people, depending on their experience. A wrong understanding of a metric can create chaos and lead to poor software quality, because the process changes made in response will be wrong.
- Developers can sometimes become obsessed with metrics and try to stay within the benchmarks at any cost. This can lead to a situation where the focus is more on the metrics than on solving business problems.
- Software is a very complex entity, and the scales used are defined based on the experience of different computer scientists. Sometimes a metric may not reflect the actual situation in a project. In one of my projects, the average cyclomatic complexity was very low, but the source code turned out to have too many classes and functions, making it hard to modify and change.
References
For the software metrics content, I referred to Wikipedia and summarized the findings.