Software Quality Metrics
Author: Zijian Yuan
Professor: Dr. William J. Hankley
 

ABSTRACT
This paper describes the different types of metrics that relate to Software Quality and focuses on Software Product Metrics. Current techniques and commercially available tools, including those used for Object Oriented Design and Programming, are surveyed. A brief tutorial on some of the more traditional metrics is also included.


1 - Introduction

Before focusing on Software Product Metrics, it is important to clarify a few points about Software Quality itself.

Software Quality is conformance to explicitly stated functional and performance requirements, explicitly documented development standards, and implicit characteristics that are expected of all professionally developed software. Furthermore, we should distinguish between software product quality and software process quality as illustrated in the following diagram:

 
 
 Software Quality 
   |        | 
   |        | 
   |        +---> Product Quality : Attributes of: Documents / Designs / Code / Tests  
   | 
   +---------> Process Quality : Attributes of: Techniques / Tools / People / Organizations / Facilities 

 Source: Deut88, p.10

 
Product quality describes the attributes of the products of the software process. It would then include, for example, the completeness of the design documents, the traceability of the design, the reliability and maintainability of the code, and the coverage of the tests.

Process quality, on the other hand, describes the attributes of the software development process itself. It takes into account five software environment elements that are present in every software project: techniques, tools, people, organization, and facilities. Process quality would then focus on, for example, the correct implementation of a technique, the productivity of a tool, the abilities of the programmers, the communicativeness of an organization, and how well suited the installations and facilities are.

Identifying Quality Metrics

A software metric is any measurement that relates to a software system, process, or related documentation. Following the same criteria as with Software Quality, Quality Metrics can be grouped according to their functionality into product, process, and people metrics.

Product metrics quantify useful attributes of the products you are generating, including those used for internal consumption. They help you assess if the product is good enough through reports on attributes like usability, reliability, maintainability and portability.

Process metrics quantify useful attributes of the software development process and its environment. They tell you whether the process is functioning optimally by reporting on attributes like cycle time and rework rate. The goal is to do the job right the first time through the process. Metrics give you the feedback needed to achieve this goal.

People metrics quantify useful attributes of those generating the products using the available processes, methods and tools (infrastructure). They tell you whether your people are being productive or whether you have personnel problems by reporting on attributes like turnover rates, productivity, and absenteeism. The goal is to keep staff happy, motivated and focussed on the task at hand.

For this project we will focus on Software Product Quality, especially on metrics that apply to it.

2 - Software Product Metrics

Ever since the first programs were coded and the first projects implemented, the people involved in this process have been concerned with the ideas and application of measurement. By measurement we mean quantified observations of some attribute or aspect of the software product.

While any measurement provides us with some additional knowledge, what ultimately distinguishes the relatively small number of "useful" measurements from the vast array of possible measurements is the degree of insight obtained and the usefulness of the information, balanced against the cost and effort it takes to obtain them.

Software product metrics help us in various ways:

2.1 Metrics Categories

According to their functionality, they are categorized as:

Examples of each are given in part 3, where they are also classified by type.

2.2 Metrics Types

 
2.2.1 Token metrics

Token metrics are calculated by counting tokens in the source code of a system, program or program unit. A token is a simple entity which makes up a program; for example, if, for, and usr_name are tokens in C: the first two are keywords while the third is a variable identifier. Some of the best-known token metrics are those proposed by M. H. Halstead, who was a professor at Purdue University.
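
As a rough illustration of token counting, the following Python sketch splits a C-like fragment into tokens with a regular expression and tallies them by category. The keyword subset, the pattern, the function name classify_tokens and the sample fragment are all assumptions made for this illustration; a real tool would use a full lexer for the language being measured.

```python
import re
from collections import Counter

# Small illustrative subset of C keywords (assumption, not a complete list).
C_KEYWORDS = {"if", "else", "for", "while", "return", "int", "char"}

# Identifiers/keywords, integer literals, ++ and --, then any single symbol.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|\+\+|--|[^\s\w]")

def classify_tokens(source):
    """Split a C-like fragment into tokens and count them by rough category."""
    counts = Counter()
    for token in TOKEN_PATTERN.findall(source):
        if token in C_KEYWORDS:
            counts["keyword"] += 1
        elif token[0].isalpha() or token[0] == "_":
            counts["identifier"] += 1
        elif token[0].isdigit():
            counts["number"] += 1
        else:
            counts["operator_or_punctuation"] += 1
    return counts

# Hypothetical fragment using the tokens named in the text.
fragment = "for (i = 0; i < n; i++) if (usr_name[i]) count = count + 1;"
counts = classify_tokens(fragment)
print(dict(counts), "total:", sum(counts.values()))
```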

2.2.2 Control flow metrics

With control flow metrics, each program unit in a software system is explored and the flow of control determined. Graphical notation is used to visualize the control structure complexity of the unit being analyzed. The original work on Cyclomatic Complexity, one of the first known control flow metrics, was published by Thomas McCabe in the mid-1970s.

2.2.3 Composite metrics

Many researchers have pointed out that metrics solely based on control structure complexity are inadequate, that although control structure complexity has an obvious effect on process metrics such as time to debug, other factors are important. Hansen and Oviedo, for example, proposed hybrid metrics that combine control flow metrics with token metrics in order to overcome the weaknesses of simple metrics.

2.2.4 System metrics

The previous categories of metrics, token-based and control flow, concentrate on measuring properties of program code. System metrics measure large-scale properties of a system, usually the quality of a system design. They are considered the most useful because they can be extracted during an early phase of a project, system design, and hence have greater predictive power. Much of the work that has been carried out on system metrics has been inspired by Christopher Alexander's book "Notes on the Synthesis of Form". In this work he demonstrated that the quality of the design of an object, be it a refrigerator, building, electronic circuit or bridge, is intimately related to the degree of interconnection within the object; it is this degree of interconnection that system metrics are intended to measure.

3 - Traditional Metrics

3.1 CRITICAL CODE MEASURES

3.1.1 Token Metrics

Halstead metrics:
Token metrics, as we have already seen, are calculated by counting tokens in the source code of a system, program or program unit. A token is a simple entity which makes up a program; for example, keywords and variable identifiers are tokens. In 1977 Maurice Halstead published a book called "Elements of Software Science". The book described various code-based measurements, which are summarized next.

Halstead's measures are computed from the following counts:

n1 = number of distinct operators in the program
n2 = number of distinct operands in the program
N1 = total number of operator occurrences
N2 = total number of operand occurrences

From these we can calculate (Source: Hatz93, p.122):

Length:            N = N1 + N2
Vocabulary:        n = n1 + n2
Predicted Length:  N^ = (n1 x log2 n1) + (n2 x log2 n2)
Program Volume:    V = N x log2 n
Effort:            E = (n1 x N2 x N x log2 n) / (2 x n2)
Time:              T = E / P   (P = productivity factor between 5 and 20)
Predicted Bugs:    B = V / 3000
Example:

1. source code:
        Z = 0;
        while X > 0
        Z = Z + Y ;
        X = X - 1 ;
        end-while ;
        print (Z) ;

2. count:

Operators         Count        Operands   Count
=                 3            Z          4
;                 5            0          2
while-endwhile    1            X          3
>                 1            Y          2
+                 1            1          1
-                 1
print             1
()                1

then: n1 = 8, n2 = 5, N1 = 14, N2 = 12.

3. We can calculate:

Length:                N = N1 + N2 = 14 + 12 = 26
Vocabulary:            n = n1 + n2 = 8 + 5 = 13
Predicted Length:      N^ = (n1 x log2 n1) + (n2 x log2 n2) = (8 x log2 8) + (5 x log2 5) = 35.6
Program Volume:        V = N x log2 (n1 + n2) = 26 x log2 13 = 96.2
Potential Volume:      V* = (2 + n2*) x log2 (2 + n2*) = 4 x log2 4 = 8   (with n2* = 2)
Implementation Level:  L = V* / V = 0.083
Effort:                E = V / L = 1159
Time:                  T = E / P = 64 (sec)   (with P = 18)
Predicted Bugs:        B = V / 3000 = 0.032
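
The calculation above can be reproduced with a short script. This is a minimal Python sketch that takes the four counts from step 2 as input rather than tokenizing the source. The function name halstead, the parameter n2_star (taken as 2 so that the potential volume comes out as 8) and the default rate p = 18 (the value implied by the time of about 64 seconds) are assumptions chosen to match the worked values.

```python
import math

def halstead(n1, n2, N1, N2, n2_star=2, p=18):
    """Halstead measures from operator/operand counts.

    n2_star and p are assumptions chosen to match the worked example.
    """
    N = N1 + N2                                        # length
    n = n1 + n2                                        # vocabulary
    N_hat = n1 * math.log2(n1) + n2 * math.log2(n2)    # predicted length
    V = N * math.log2(n)                               # program volume
    V_star = (2 + n2_star) * math.log2(2 + n2_star)    # potential volume
    L = V_star / V                                     # implementation level
    E = V / L                                          # effort
    T = E / p                                          # time in seconds
    B = V / 3000                                       # predicted bugs
    return {"N": N, "n": n, "N_hat": N_hat, "V": V, "V_star": V_star,
            "L": L, "E": E, "T": T, "B": B}

# Counts from step 2 of the example.
m = halstead(n1=8, n2=5, N1=14, N2=12)
for name, value in m.items():
    print(f"{name:>6}: {value:.3f}")
```

Because the script does not round intermediate results, the effort it prints differs slightly from a figure obtained by dividing the rounded volume by the rounded implementation level.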


3.1.2 Control flow metrics

McCabe's Complexity Metric:
Proposed by Thomas McCabe, it is based on control flow within the code. A program graph is used to depict control flow: each circle represents a processing task and each directed arrow represents flow of control. McCabe defines (and uses) the cyclomatic complexity from this graph.
Cyclomatic complexity increases with the number of decision paths and loops, giving an indication of the complexity of the code and the difficulty of testing it. McCabe suggests that an upper limit of 10 should be imposed for projects.

Example:
1. source code:

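#include <iostream>
#include <string>
using namespace std;
int main() {
    int a, b, c;                                                          // side lengths
    string type;                                                          // classification result
    cin >> a >> b >> c;                                                   //1
    type = "scalene";                                                     //2
    if (a == b || b == c || a == c) type = "isosceles";                   //3
    if (a == b && b == c) type = "equilateral";                           //4
    if (a >= b + c || b >= a + c || c >= a + b) type = "not a triangle";  //5
    if (a <= 0 || b <= 0 || c <= 0) type = "bad inputs";                  //6
    cout << type;                                                         //7
}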

2. Triangle CFG:

The control flow graph (CFG) consists of nodes and arcs:
-- a node represents a straight-line section of code, i.e. code without jumps in or out
-- an arc represents a possible sequence of execution

3. Cyclomatic Number:

C = e - n + 2p
where
number of edges,                 e = 12;
number of nodes,                 n = 9;
number of connected components,  p = 1.
Then, C = 12 - 9 + (2 x 1) = 5.
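
The same count can be checked mechanically. The Python sketch below computes e - n + 2p from a list of arcs. The node labels are an assumed layout for the triangle program (statements 1-2 and the test on line 3 form one node, each if body and each later test is its own node); this layout gives the 9 nodes and 12 arcs quoted above, but it is not taken from the original figure.

```python
def cyclomatic_number(edges, components=1):
    """C = e - n + 2p for a control flow graph given as (source, target) arcs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components

# Assumed labelling: "s123" = statements 1-2 plus the test on line 3,
# "b3" = body of the if on line 3, "t4" = test on line 4, ..., "s7" = output.
triangle_cfg = [
    ("s123", "b3"), ("s123", "t4"), ("b3", "t4"),
    ("t4", "b4"), ("t4", "t5"), ("b4", "t5"),
    ("t5", "b5"), ("t5", "t6"), ("b5", "t6"),
    ("t6", "b6"), ("t6", "s7"), ("b6", "s7"),
]

c = cyclomatic_number(triangle_cfg)
print("cyclomatic number:", c)
```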
 

3.1.3 Composite metrics

Purity ratio:

The code's estimated length divided by its actual length. This is a measure of code optimization: the lower the ratio, the more likely it is that the code is longer than its functionality requires.

The higher the ratio is above 1.00, the more optimized the code. The standard purity ratio is 1.0, and the ideal is a value greater than 1.25. The purity ratio also indicates the extent to which impurities exist in the code. Programs that exceed their predicted length have latent impurities in them. There are six generally recognized classes of impurities: canceling of operators, ambiguous operands, synonymous operands, common sub-expressions, unnecessary replacements, and unfactored expressions.

It is calculated using the formula: Purity Ratio = N^ / N

where N^ = predicted length, as given by the Predicted Length formula, and

N = length, as given by the Length formula.
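
As a quick check of the formula, the following Python sketch computes the purity ratio for the counts used in the Halstead example of section 3.1.1 (n1 = 8, n2 = 5, N1 = 14, N2 = 12). The function name purity_ratio is an illustrative choice.

```python
import math

def purity_ratio(n1, n2, N1, N2):
    """Predicted Halstead length divided by actual length."""
    predicted = n1 * math.log2(n1) + n2 * math.log2(n2)
    actual = N1 + N2
    return predicted / actual

ratio = purity_ratio(n1=8, n2=5, N1=14, N2=12)
print(f"purity ratio: {ratio:.2f}")
```

For these counts the ratio comes out at roughly 1.37, above the 1.25 mark quoted as ideal.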

Development Effort:

Reflects the number of mental discriminations a programmer performs to write the program or function. This is a critical measure of code quality.

It is calculated using the formula: E = V / L^

where:

L^ = (2 / n1) x (n2 / N2)
V = program volume
n1 = number of distinct operators in the program
n2 = number of distinct operands in the program
N2 = total number of operand occurrences
Functional Density:
An essential indicator of the code's optimization and compactness. A function is a self-contained unit of program code designed to accomplish a particular task (an action takes place or a value is provided); at a minimum, a function is based on an input (entry point) that results in an output (exit point).

It is calculated using the formula LOC / FP, where LOC = lines of code and FP = function points.

The ideal value of functional density is 36.

3.2 CRITICAL COVERAGE MEASURES

3.2.1 Executable one trip path per function:

Based on the segment/branch level analysis of the code, this provides a coverage analysis and testability indicator. Programs at the extreme maximum are quantitatively untestable or require excessive testing resources. The ideal value of this parameter is between 20 and 50.

3.2.2 Average Number of Segments per Path:

The average number of logical links traversed between entry and exit point for all paths assigned to a function is a strong indicator of a function's relative performance. The ideal value of this parameter is between 15 and 25.

3.3 CRITICAL SUMMARY MEASURES

3.3.1 Estimated development time per function (hours):

It is calculated using the formula: T^ = E / S / 60 / 60

where E = effort and S = Stroud Index (sliding index). Dividing by 60 twice converts seconds to hours.

The normal index is 18 for "average" programmers. The index is 5-10 for beginning programmers and up to 40+ for highly efficient programmers.

The ideal value of T^ is between 1.5 and 2.5, and the standard value is 4.5.
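
The conversion can be sketched in a few lines of Python. The function name estimated_hours and the sample effort values are assumptions chosen for illustration; the default index of 18 is the "average" programmer value quoted above.

```python
def estimated_hours(effort, stroud_index=18):
    """Estimated development time in hours: effort / index gives seconds."""
    seconds = effort / stroud_index
    return seconds / 60 / 60

# Hypothetical effort values for two functions.
for effort in (100_000, 300_000):
    print(f"E = {effort:>7}: {estimated_hours(effort):.2f} hours")
```

With the default index, an effort of 100,000 falls just inside the 1.5 to 2.5 hour range, and an effort of 300,000 lands near the standard value of 4.5 hours.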

3.4 MAINTAINABILITY AND READABILITY MEASURES

3.4.1 Lines of code:
Should not exceed 62 lines of code per function

3.4.2 Number of executable statements:
Should not exceed 50 executable statements per function.

3.4.3 Number of comment lines:
Should be at least 60 % per function.

3.4.4 Span of reference for variables:
Average maximum number of lines between references to each variable assignment and use in a function. It should not be greater than 10-12 lines for various security reasons.
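
The four limits in this section can be combined into a simple per-function check. This Python sketch assumes the counts have already been collected by some other tool. The FunctionCounts fields, the function name check_limits, the reading of the comment percentage as comment lines divided by lines of code, and the use of 12 as the span limit are all assumptions made for the illustration.

```python
from dataclasses import dataclass

@dataclass
class FunctionCounts:
    lines_of_code: int
    executable_statements: int
    comment_lines: int
    max_reference_span: int   # lines between references to a variable

def check_limits(c):
    """Return a list of the maintainability limits that a function exceeds."""
    problems = []
    if c.lines_of_code > 62:
        problems.append("more than 62 lines of code")
    if c.executable_statements > 50:
        problems.append("more than 50 executable statements")
    # Assumption: the comment percentage is relative to lines of code.
    if c.lines_of_code and c.comment_lines / c.lines_of_code < 0.60:
        problems.append("comment lines below 60%")
    if c.max_reference_span > 12:
        problems.append("span of reference above 12 lines")
    return problems

# Hypothetical counts for two functions.
small = FunctionCounts(40, 25, 30, 8)
large = FunctionCounts(80, 60, 10, 20)
print(check_limits(small))
print(check_limits(large))
```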

4 - Quality Metrics for Object Oriented Environments

Object oriented analysis and design are popular concepts in today's software development environment. They are often heralded as the silver bullet for solving software problems. While in reality there is no silver bullet, object orientation has proved its value for systems that must be maintained and modified. Object oriented software development requires a different approach from more traditional functional decomposition and data flow development methods. While the functional and data analysis approaches commence by considering the system's behavior and/or data separately, object oriented analysis approaches the problem by looking for system entities that combine them. Object oriented analysis and design focuses on objects as the primary agents involved in a computation; each class of data and its related operations are collected into a single system entity.

4.1 METRICS FOR OBJECT ORIENTED SYSTEMS
Many different metrics have been proposed for object oriented systems. The object oriented metrics chosen here measure principal structures that, if improperly designed, negatively affect the design and code quality attributes.
The selected object oriented metrics are primarily applied to the concepts of classes, coupling, and inheritance.

4.1.1 Class
A class is a template from which objects can be created. This set of objects share a common structure and a common behavior manifested by the set of methods. Three class metrics described here measure the complexity of a class using the class's methods, messages and cohesion.

4.1.1.1 Method
A method is an operation upon an object.

Weighted Methods per Class (WMC) :
The WMC is either a count of the methods implemented within a class or the sum of the complexities of those methods (method complexity is measured by cyclomatic complexity). The second measurement is difficult to implement since not all methods are accessible within the class hierarchy due to inheritance. The number of methods and the complexity of the methods involved is a predictor of how much time and effort is required to develop and maintain the class. The larger the number of methods in a class, the greater the potential impact on children, since children will inherit all the methods defined in a class. Classes with large numbers of methods are likely to be more application specific, limiting the possibility of reuse. This metric measures usability and reusability.
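
Both readings of WMC are simple sums once the per-method complexities are known. The Python sketch below assumes those complexities are supplied as a mapping; the function name wmc and the complexity values for the Stack methods are hypothetical.

```python
def wmc(method_complexities, weighted=True):
    """Weighted Methods per Class.

    weighted=True sums the cyclomatic complexities of the methods;
    weighted=False simply counts the methods.
    """
    if weighted:
        return sum(method_complexities.values())
    return len(method_complexities)

# Hypothetical cyclomatic complexities for the methods of a Stack class.
stack_methods = {"push": 2, "pop": 2, "depth": 1, "val": 1}
print("method count:", wmc(stack_methods, weighted=False))
print("sum of complexities:", wmc(stack_methods))
```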

4.1.1.2 Message
A message is a request that an object makes of another object to perform an operation. The operation executed as a result of receiving a message is called a method. The next metric looks at methods and messages within a class.

Response for a Class (RFC) :
The RFC is the cardinality of the set of all methods that can be invoked in response to a message to an object of the class or by some method in the class. This includes all methods accessible within the class hierarchy. This metric looks at the combination of the complexity of a class through the number of methods and the amount of communication with other classes. The larger the number of methods that can be invoked from a class through messages, the greater the complexity of the class. If a large number of methods can be invoked in response to a message, the testing and debugging of the class becomes complicated since it requires a greater level of understanding on the part of the tester. A worst case value for possible responses will assist in the appropriate allocation of testing time. This metric evaluates system design as well as the usability and the testability.
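
A minimal Python sketch of the response set follows, assuming the direct calls made by each method are already known. It counts the methods of the class plus the methods they invoke directly, and does not follow calls further down the chain. The class, its methods, and the function name response_for_class are hypothetical.

```python
def response_for_class(calls):
    """Size of the response set: the class's own methods plus every method
    they invoke directly. `calls` maps each method to the methods it calls."""
    response_set = set(calls)
    for invoked in calls.values():
        response_set |= set(invoked)
    return len(response_set)

# Hypothetical Account class and the methods each of its methods calls.
account_calls = {
    "deposit": {"Ledger.record", "validate"},
    "withdraw": {"Ledger.record", "validate", "Notifier.alert"},
    "validate": set(),
}
rfc = response_for_class(account_calls)
print("RFC:", rfc)
```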

4.1.1.3 Cohesion
Cohesion is the degree to which methods within a class are related to one another and work together to provide well-bounded behavior. Effective object-oriented designs maximize cohesion since it promotes encapsulation. The third class metric investigates cohesion.

Lack of Cohesion of Methods (LCOM):
LCOM measures the degree of similarity of methods by instance variable or attributes. Any measure of separateness of methods helps identify flaws in the design of classes. There are at least two different ways of measuring cohesion:
1. Calculate for each data field in a class what percentage of the methods use that data field. Average the percentages then subtract from 100%. Lower percentages mean greater cohesion of data and methods in the class.
2. Methods are more similar if they operate on the same attributes. Count the number of disjoint sets produced from the intersection of the sets of attributes used by the methods.
High cohesion indicates good class subdivision. Lack of cohesion or low cohesion increases complexity, thereby increasing the likelihood of errors during the development process. Classes with low cohesion could probably be subdivided into two or more subclasses with increased cohesion. This metric evaluates the design implementation as well as reusability.

Example:
1. Class example:

Class Stack
        Attributes
                stack, stack-index (si)
        Functions
                push - uses/def stack, si
                pop - uses/def si
                depth - uses si
                val - uses stack, si

2. Graph of attribute use:

Nodes on the left: set of attributes
Nodes on the right: set of pairs of functions
Arcs: a pair of functions is connected to an attribute if both functions access that attribute

3. Lack of Cohesion of Methods

P is the set of function pairs that share no attribute, and Q is the set of pairs that share at least one. Every pair in this class shares si, so:

P = { };
Q = {push-pop, push-depth, push-val, pop-depth, pop-val, val-depth}

LCOM = max( |P| - |Q|, 0 ) = max(0 - 6, 0) = 0
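
Both ways of measuring lack of cohesion can be sketched in Python for the Stack example. The mapping from each function to the attributes it uses is taken from the class example; the function names lcom_pairs and lcom_percent are illustrative assumptions.

```python
from itertools import combinations

def lcom_pairs(uses):
    """max(|P| - |Q|, 0): P = method pairs sharing no attribute, Q = the rest."""
    p = q = 0
    for a, b in combinations(uses, 2):
        if uses[a] & uses[b]:
            q += 1
        else:
            p += 1
    return max(p - q, 0)

def lcom_percent(uses):
    """100% minus the average percentage of methods that use each attribute."""
    attributes = set().union(*uses.values())
    if not attributes or not uses:
        return 0.0
    shares = [sum(attr in used for used in uses.values()) / len(uses)
              for attr in attributes]
    return 100.0 * (1 - sum(shares) / len(shares))

stack = {
    "push": {"stack", "si"},
    "pop": {"si"},
    "depth": {"si"},
    "val": {"stack", "si"},
}
print("pair-based LCOM:", lcom_pairs(stack))
print("percentage LCOM:", lcom_percent(stack))
```

The pair-based form reports no lack of cohesion for Stack, while the percentage form is more sensitive: stack is used by only half of the functions, so the result is above zero.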
 

4.1.2 Coupling
Coupling is a measure of the strength of association established by a connection from one entity to another. Classes (objects) are coupled three ways:
1. When a message is passed between objects, the objects are said to be coupled.
2. Classes are coupled when methods declared in one class use methods or attributes of the other classes.
3. Inheritance introduces significant tight coupling between superclasses and their subclasses. (Since good object oriented design requires a balance between coupling and inheritance, coupling measures focus on non-inheritance coupling.)

Coupling Between Object Classes (CBO):
CBO is a count of the number of other classes to which a class is coupled. It is measured by counting the number of distinct non-inheritance related class hierarchies on which a class depends. Excessive coupling is detrimental to modular design and prevents reuse. The more independent a class is, the easier it is to reuse in another application. The larger the number of couples, the higher the sensitivity to changes in other parts of the design, and therefore the more difficult maintenance becomes. Strong coupling complicates a system since a module is harder to understand, change or correct by itself if it is interrelated with other modules. Complexity can be reduced by designing systems with the weakest possible coupling between modules. This improves modularity and promotes encapsulation. CBO evaluates design implementation and reusability.
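
A simplified Python sketch of the count follows. It assumes single inheritance, a map from each class to the classes whose methods or attributes it uses, and a map from each class to its parent; it counts individual classes rather than whole hierarchies. All class names and the function names are hypothetical.

```python
def ancestors_of(cls, parent):
    """All ancestors of cls, assuming `parent` describes a tree (no cycles)."""
    found = set()
    while cls in parent:
        cls = parent[cls]
        found.add(cls)
    return found

def cbo(cls, uses, parent):
    """Number of distinct classes cls depends on, ignoring itself and
    classes related to it by inheritance."""
    coupled = set(uses.get(cls, ())) - {cls} - ancestors_of(cls, parent)
    return len(coupled)

# Hypothetical design: SavingsAccount extends Account and uses two helpers.
parent = {"SavingsAccount": "Account"}
uses = {"SavingsAccount": {"Account", "Ledger", "Notifier", "SavingsAccount"}}
print("CBO:", cbo("SavingsAccount", uses, parent))
```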

4.1.3 Inheritance
Another design abstraction in object oriented systems is the use of inheritance. Inheritance is a type of relationship among classes that enables programmers to reuse previously defined objects including variables and operators. Inheritance decreases complexity by reducing the number of operations and operators, but this abstraction of objects can make maintenance and design difficult. The two metrics used to measure the amount of inheritance are the depth and breadth of the inheritance hierarchy.

Depth of Inheritance Tree (DIT):
The depth of a class within the inheritance hierarchy is the maximum length from the class node to the root of the tree, measured by the number of ancestor classes. The deeper a class is within the hierarchy, the greater the number of methods it is likely to inherit, making its behavior more complex to predict. Deeper trees constitute greater design complexity, since more methods and classes are involved, but also offer greater potential for reuse of inherited methods. A support metric for DIT is the number of methods inherited (NMI). This metric primarily evaluates reuse but also relates to understandability and testability.

Number of Children (NOC):
The number of children is the number of immediate subclasses subordinate to a class in the hierarchy. It is an indicator of the potential influence a class can have on the design and on the system. The greater the number of children, the greater the likelihood of improper abstraction of the parent, which may be a case of misuse of subclassing. On the other hand, the greater the number of children, the greater the reuse, since inheritance is a form of reuse. If a class has a large number of children, it may require more testing of the methods of that class, thus increasing the testing time. NOC, therefore, primarily evaluates testability and design.
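
Both inheritance metrics can be read off a child-to-parent map. The Python sketch below assumes single inheritance, so each class has at most one parent; the Shape hierarchy and the function names are hypothetical.

```python
def depth_of_inheritance(cls, parent):
    """DIT: number of ancestor classes between cls and the root."""
    depth = 0
    while cls in parent:
        cls = parent[cls]
        depth += 1
    return depth

def number_of_children(cls, parent):
    """NOC: number of immediate subclasses of cls."""
    return sum(1 for p in parent.values() if p == cls)

# Hypothetical hierarchy: child -> parent.
parent = {
    "Circle": "Shape",
    "Polygon": "Shape",
    "Triangle": "Polygon",
    "Square": "Polygon",
}
for name in ("Shape", "Polygon", "Triangle"):
    print(name, "DIT =", depth_of_inheritance(name, parent),
          "NOC =", number_of_children(name, parent))
```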

5 - Conclusions
 
Software Quality Metrics are very important because they supply the hard data an organization can use to obtain valuable information about project progress, organizational productivity, and enterprise profitability. All practitioners employ some measurement of their code (even if they do not realize it) whenever they use a compiler or source library configuration management tools. Currently, many companies and other institutions are trying to adopt more complex and automated measurement tools into their development processes as they strive to improve their software products. The information obtained can be used to intelligently plan and focus dynamic testing efforts and apply valuable resources where they will deliver the greatest impact.

Although we can see that a significant amount of research has been carried out in this field, Software Metrics are well behind the state of the art due to various factors. Among them we can distinguish some, such as:

Any organization that wants to introduce metrics into its Software Development Process has to (1) evaluate itself to see if it is ready to take that step and (2) define what it wants to measure and why. Not complying with these requirements might lead to catastrophic results within the organization. There is a lot of literature on how to evaluate readiness factors, but the scope of this paper does not allow us to cover the topic in more detail.

For a tutorial on how to calculate McCabe's Complexity follow this link. For a simple tool that calculates some of the metrics covered here for Java code, visit Reliable Software Technologies' TotalMetric Web Page. To compare the results obtained by TotalMetric on your Java code, you can use the following table, which summarizes the measures and benchmarks adopted by the NSA's Software Engineering Applied Technology Center.
 
Measure                                            Standard   Ideal       Comments
Purity Ratio                                       1.0        >1.25       Automated GUI and DB procedure call range: .5-1+
Volume                                             3,200      <1,000      GUI: <3,200-10,000   DB: <3,200-7,500
Effort                                             300,000    <=100,000   GUI: <500,000-5,000,000   DB: <300,000-1,000,000
Cyclomatic complexity                              10         <10         6-7 for C++; 2-10 GUI, <10-25 DB
Functional density                                 62         36
Executable path analysis per function              <100       20-50
Average number of logical branch links per path    <50        15-25
Estimated time to develop function in hours        4.5        1.5-2.5
Predicted errors / KLOC                            4          <3
Size                                               62
Percent of comment lines per function              60         60+
Executable statements per function                 <50        15-30
Span of reference for variable                     10-12
 

APPENDIX

A - Book References

Deut88 : Deutsch, Michael S. / Willis, Ronald R. "Software Quality Engineering - A Total Technical and Management Approach" , Prentice-Hall Series in Software Engineering, 1988  (KSU Stack # QA 76.76.Q35 D48 1988)

Hatz93 : Hatzel, Bill, "Making Software Measurement Work: Building an Effective Measurement Program", QED Publishing Group, 1993 (KSU Stack # QA 76.76.Q35 H48 1993)

Jone97 : Jones, Capers, "Software Quality: Analysis and Guidelines for Success", International Thomson Computer Press, 1997 (KSU Stack # QA 76.76.Q35 J675 1997)

Kitc87 : Kitchenham, B.A. / Littlewood, B. "Measurements for Software Control and Assurance", Elsevier Applied Science, 1987 (KSU Stack # QA 76.76.Q35 C57 1988)
 
B - Links to WWW references
http://www.uis.harvard.edu/year2000/testing/testtoole.html

Information on Software Quality and a lot more
http://www.bullseye.com/webSqa.html

Software Metrics Home Page - The STSC Measurement Page, contains links to other related sites, articles
http://stsc.hill.af.mil/Metrics/index.html

Software Quality Assurance and Metrics - Slide show.
http://www.taejon-c.ac.kr/~kolee/se_note/sld020.htm 

Software Quality Assurance and Testing Tools
http://wwwlis.iei.pi.cnr.it/LIS/Overview/tools.html

Software Quality Assurance Plan - Sample Plan
http://www.nfra.nl/~seg/SQAP-1.0.html

Software Quality, metrics, etc.
http://louisa.levels.unisa.edu.au/se1/management-sqa/week7a96.htm

Software Quality Metrics - Page containing lots of links
http://av.yahoo.com/bin/query?p=software+quality+metrics&hc=0&hs=5

Software Quality Page - Many links to related sites
http://www.tiac.net/users/pustaver/

University of South Australia: good paper on Software Quality Assurance.
http://louisa.levels.unisa.edu.au/se1/management-sqa/week7a96.htm

Warren Harrison's Software Metrics Homepage
http://www.cs.pdx.edu/~warren/Metrics/coverpage.html

C - Links to Tools

Bullseye Testing Tech.: C-Cover -- Provides instrumentation-based test coverage analysis of C programs. (Hatz93)
http://www.bullseye.com/

Computer Associates: CA-Metrics -- Project planning and metrics repository support with comparison to industry norms and averages. (Hatz93)
http://www.cai.com/

Compuware Corp.: Pathvu -- Portfolio analysis for COBOL or Assembler programs providing static analysis of many common program measures like numbers of IF or looping statements, nesting level, percentage of control statements, verbs, data elements, element references, and so on. (Hatz93)
http://www.compuware.com/

Logiscope: A range of products to improve programming quality, maintenance and test coverage through a comprehensive source code tool set. Its high level of integration and its pragmatic approach allow developers to use it as a natural complement to compiling, profiling and debugging environments as well as "Capture/Replay" test tools. (Hatz93) * Downloadable *
http://www.verilogusa.com/log/logiscop.htm

McCabe and Associates - Vendors of a range of commercial tools for metric and test analysis such as ACT. ACT - Analysis of Complexity Tool to analyze source code and plot its control flow graphically. The tool provides basic complexity measures, quantifies the number of tests needed, and identifies test paths and conditions needed to achieve structural code coverage.
http://www.mccabe.com/

Panorama: a version comparator for an entire C/C++ program (rather than a file only). * Downloadable *
http://www.softwareautomation.com/

Reliable Software Technologies - Offers an introductory version of TotalMetric for Java. Metrics currently included are Cyclomatic Complexity, Extended C.C., Difficulty, Effort, Volume, JavaDoc, Method count. * Downloadable after registration *
http://www.rstcorp.com

Software Quality measuring tools
http://www.ashling.com/sqa.html

Software Quality metrics for large scale systems development
http://www.macdett.com/incose/incos96b.htm

Software Research - Offers METRICS as a part of the Test Works/Advisor tool suite to quantitatively determine source code quality.
http://www.soft.com/


Contact me:zijianATcis.ksu.edu
Or to my homepage: http://www.cis.ksu.edu/~zijian


 

Last Updated on: 11/14/99 - 10:30 am.