ABSTRACT
This paper describes the different types of metrics that relate to
Software Quality and focuses on Software Product Metrics. Current techniques
and commercially available tools, including those that are used for Object
Oriented Design and Programming, are surveyed. A brief tutorial on some
of the more traditional metrics is also included.
Before focusing on Software Product Metrics, it is important to clarify some points about Software Quality itself.
Software Quality is conformance to explicitly stated functional and performance requirements, explicitly documented development standards, and implicit characteristics that are expected of all professionally developed software. Furthermore, we should distinguish between software product quality and software process quality as illustrated in the following diagram:
Software Quality
   |
   +---> Product Quality : Attributes of: Documents / Designs / Code / Tests
   |
   +---> Process Quality : Attributes of: Techniques / Tools / People / Organizations / Facilities

Source: Deut88, p.10
Process quality, on the other hand, describes the attributes of the software development process itself. Five software environment elements that are present in all software projects are taken into account: techniques, tools, people, organization, and facilities. Process quality then would focus on, for example, the correct implementation of a technique, the productivity of a tool, the abilities of the programmers, the communicativeness of an organization, and how well suited the installations and facilities are.
Identifying Quality Metrics
A software metric is any measurement that relates to a software system, process or related documentation. Following the same criteria as with Software Quality, Quality Metrics can be grouped according to their functionality into product, process, and people metrics.
Product metrics quantify useful attributes of the products you are generating, including those used for internal consumption. They help you assess if the product is good enough through reports on attributes like usability, reliability, maintainability and portability.
Process metrics quantify useful attributes of the software development process and its environment. They tell you if the process is functioning optimally as they report on attributes like cycle time and rework rate. The goal is to do the right job the first time through the process. Metrics give you the feedback needed to achieve this goal.
People metrics quantify useful attributes of those generating the products using the available processes, methods and tools (infrastructure). They tell you if your people are being productive or if you have personnel problems as they report on attributes like turnover rates, productivity, and absenteeism. The goal is to keep staff happy, motivated and focussed on the task at hand.
For this project we will focus on Software Product Quality, especially on metrics that apply to it.
From the early days, when the first programs were coded and the first projects implemented, the people involved in this process have been concerned with the ideas and application of measurement. By measurement we mean quantified observations of some attribute or aspect of the software product.
While any measurement provides us with some additional knowledge, what ultimately distinguishes the relatively small number of "useful" measurements from the vast array of possible measurements is the degree of insight obtained and the usefulness of the information balanced against the cost and effort it takes to obtain them.
Software product metrics help us in various ways:
According to their functionality, they are categorized as:
2.2 Metrics Types
2.2.1 Token metrics
Token metrics are calculated by counting tokens in the source code of a system, program or program unit. A token is a simple entity which makes up a program; for example, if, for, and usr_name are tokens in C: the first two are key words while the third is a variable identifier. Some of the best known token metrics are those proposed by M.H. Halstead, who was a professor at Purdue University.
2.2.2 Control flow metrics
With control flow metrics, each program unit in a software system is explored and the flow of control determined. Graphical notation is used to visualize the control structure complexity of the unit being analyzed. The original work on Cyclomatic Complexity, one of the first control flow metrics known, was published by Thomas McCabe in the mid-1970s.
2.2.3 Composite metrics
Many researchers have pointed out that metrics solely based on control structure complexity are inadequate, that although control structure complexity has an obvious effect on process metrics such as time to debug, other factors are important. Hansen and Oviedo, for example, proposed hybrid metrics that combine control flow metrics with token metrics in order to overcome the weaknesses of simple metrics.
2.2.4 System metrics
The previous categories of metrics, token-based and control flow, concentrate on measuring properties of program code. System metrics measure large-scale properties of a system, usually the quality of a system design. They are considered the most useful because they can be extracted during an early phase of a project, system design, and hence are capable of predicting more. Much of the work that has been carried out on system metrics has been inspired by Christopher Alexander's book "Notes on the Synthesis of Form". In this work he demonstrated that the quality of the design of an object, be it a refrigerator, building, electronic circuit or bridge, was intimately related to the degree of interconnection that occurred in the object; it is this degree of interconnection that system metrics are intended to measure.
3.1 CRITICAL CODE MEASURES
3.1.1 Token Metrics
Halstead metrics :
Token metrics, as we have already seen, are calculated by counting
tokens in the source code of a system, program or program unit. A token
is a simple entity that makes up a program; for example, key words and
variable identifiers are tokens. In 1977 Maurice Halstead published a book
called "Elements of Software Science". The book described various code-based
measurements. They are summarized next.
By obtaining the following values:
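n1 = number of distinct operators
n2 = number of distinct operands
N1 = total number of operator occurrences
N2 = total number of operand occurrences
the measures illustrated in the following example can be calculated.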
1. source code:
Z = 0;
while X > 0
Z = Z + Y ;
X = X - 1 ;
end-while ;
print (Z) ;
2. count:
| Operator | Count | Operand | Count |
| = | 3 | Z | 4 |
| ; | 5 | 0 | 2 |
| while-endwhile | 1 | X | 3 |
| > | 1 | Y | 2 |
| + | 1 | 1 | 1 |
| - | 1 | | |
| print | 1 | | |
| () | 1 | | |
then: n1 = 8, n2 = 5, N1 = 14, N2 = 12.
3. We can calculate:
Length: N = N1 + N2 = 14 + 12 = 26
Vocabulary: n = n1 + n2 = 8 + 5 = 13
Predicted Length: N^ = (n1 * log2 n1) + (n2 * log2 n2) = (8 * log2 8) + (5 * log2 5) = 35.6
Program Volume: V = N * log2 (n1 + n2) = 26 * log2 13 = 96.2
Potential Volume: V* = (2 + n2*) * log2 (2 + n2*) = 8 (taking n2* = 2)
Implementation level: L = V* / V = 8 / 96.2 = 0.083
Effort: E = V / L = 96.2 / 0.083 = 1159
Time: T = E / P = 1159 / 18 = 64 (sec), with P = 18
Predicted Bugs: B = V / 3000 = 96.2 / 3000 = 0.032
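The calculation above can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function name halstead_measures and its parameter names are hypothetical, the values n2* = 2 and P = 18 are assumed because they reproduce the figures of the example, and the four counts are taken from step 2 rather than derived by parsing the source code.

```python
import math


def halstead_measures(n1, n2, N1, N2, n2_star=2, stroud=18):
    """Compute Halstead measures from operator/operand counts.

    n1, n2  -- distinct operators / distinct operands
    N1, N2  -- total operator / operand occurrences
    n2_star -- assumed number of conceptual input/output operands
    stroud  -- assumed Stroud number (mental discriminations per second)
    """
    length = N1 + N2
    vocabulary = n1 + n2
    predicted_length = n1 * math.log2(n1) + n2 * math.log2(n2)
    volume = length * math.log2(vocabulary)
    potential_volume = (2 + n2_star) * math.log2(2 + n2_star)
    level = potential_volume / volume
    effort = volume / level
    return {
        "N": length,
        "n": vocabulary,
        "N_hat": predicted_length,
        "V": volume,
        "V_star": potential_volume,
        "L": level,
        "E": effort,
        "T": effort / stroud,   # seconds
        "B": volume / 3000,
    }


# Counts from the worked example: n1 = 8, n2 = 5, N1 = 14, N2 = 12.
m = halstead_measures(n1=8, n2=5, N1=14, N2=12)
for name, value in m.items():
    print(f"{name:7s} = {value:.3f}")
```

Because the sketch keeps full precision throughout, its effort value comes out near 1157 rather than 1159; the hand calculation rounds L to 0.083 before dividing.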
3.1.2 Control flow metrics
McCabe's Complexity Metric:
Proposed by Thomas McCabe, it is based on control flow within the code.
A program graph is used to depict control flow. Each circle represents
a processing task and directed arrows represent flow of control. McCabe defines
(and uses) the cyclomatic complexity from the graph.
Cyclomatic complexity increases with the number of decision paths and loops,
giving an indication of complexity and difficulty of testing. McCabe suggests
that an upper limit of 10 should be imposed for projects. See tutorial.
Example:
1. source code:
cin >> a >> b >> c;                                                   // 1
type = "scalene";                                                     // 2
if (a == b || b == c || a == c) type = "isosceles";                   // 3
if (a == b && b == c) type = "equilateral";                           // 4
if (a >= b + c || b >= a + c || c >= a + b) type = "not a triangle";  // 5
if (a <= 0 || b <= 0 || c <= 0) type = "bad inputs";                  // 6
cout << type;                                                         // 7
2.Triangle CFG:
CFG consists of nodes and arcs
--node represents a straight-line section of code i.e. code without
jumps in or out
--arc represents a possible sequence of execution
3. Cyclomatic Number:
C = e - n + 2p
where
number of edges, e = 12;
number of nodes, n = 9;
number of connected components,
p = 1;
Then, C = 12 - 9 + 2 * 1 = 5;
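The cyclomatic number can also be computed mechanically from an edge list. The sketch below is an illustration only: the graph is an assumed reconstruction (one plausible control flow graph for the triangle code that has 9 nodes and 12 edges, with each if contributing a decision node and a body node), and all node names and the function name are hypothetical.

```python
def cyclomatic_complexity(nodes, edges):
    """Return C = e - n + 2p for a control flow graph.

    p is the number of connected components, found by a depth-first
    search that treats the arcs as undirected.
    """
    adjacency = {node: set() for node in nodes}
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(adjacency[current] - seen)

    return len(edges) - len(nodes) + 2 * components


# Assumed graph: statements 1-2 and the test of statement 3 form the first node.
NODES = ["entry_if3", "isosceles", "if4", "equilateral", "if5",
         "not_triangle", "if6", "bad_inputs", "output"]
EDGES = [
    ("entry_if3", "isosceles"), ("entry_if3", "if4"), ("isosceles", "if4"),
    ("if4", "equilateral"), ("if4", "if5"), ("equilateral", "if5"),
    ("if5", "not_triangle"), ("if5", "if6"), ("not_triangle", "if6"),
    ("if6", "bad_inputs"), ("if6", "output"), ("bad_inputs", "output"),
]

print(cyclomatic_complexity(NODES, EDGES))
```

With four two-way decisions in the code, the result agrees with the common shortcut of counting decisions and adding one.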
3.1.3 Composite metrics
Purity ratio:
The code's estimated length divided by its actual length. This is a measure of code optimization: the lower the ratio, the greater the probability that excessive code implements the functionality.
The higher it is above 1.00, the more optimized the code. The standard purity ratio is 1.0 and the ideal is a value greater than 1.25. Purity ratio also indicates the extent to which impurities exist in the code. Programs that exceed their predicted length have latent impurities in them. There are six generally recognized classes of impurities: canceling of operators, ambiguous operands, synonymous operands, common sub-expressions, unnecessary replacements, and unfactored expressions.
It is calculated using the formula: Purity Ratio = N^ / N,
where N^ = predicted length as determined by Predicted Length
and N = length as determined by Length.
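As a small illustration, the purity ratio of the while-loop example from the token metrics section can be computed as follows. This is a sketch, not a reference implementation; the function name purity_ratio is hypothetical and the counts are those of that example.

```python
import math


def purity_ratio(n1, n2, N1, N2):
    """Predicted Halstead length divided by actual length."""
    predicted_length = n1 * math.log2(n1) + n2 * math.log2(n2)
    actual_length = N1 + N2
    return predicted_length / actual_length


# Counts from the token metrics example: n1 = 8, n2 = 5, N1 = 14, N2 = 12.
ratio = purity_ratio(n1=8, n2=5, N1=14, N2=12)
print(f"Purity ratio = {ratio:.2f}")
```

For these counts the ratio is about 1.37, which is above the ideal threshold of 1.25 quoted above.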
Development Effort:
Reflects the number of mental discriminations a programmer performs to write the program/function. This is a critical measure of code quality.
It is calculated using the formula: E = V / L^ .
Where V = volume as determined by Program Volume
and L^ = level as determined by Implementation level.
Functional density:
It is calculated using the formula LOC / FP, where LOC = lines of code and FP = function points.
The ideal value of functional density is 36.
3.2 CRITICAL COVERAGE MEASURES
3.2.1 Executable one trip path per function:
Based on the segment/branch level analysis of the code, this provides a coverage analysis and testability indicator. Programs at the extreme maximum are quantitatively untestable or require excessive testing resources. The ideal value of this parameter is between 20 and 50.
3.2.2 Average Number of Segments per Path:
The average number of logical links traversed between entry and exit point for all paths assigned to a function is a strong indicator of a function's relative performance. The ideal value of this parameter is between 15 and 25.
3.3 CRITICAL SUMMARY MEASURES
3.3.1 Estimated development time per function (hours):
It is calculated using the formula: T^ = E / S / 60 / 60,
where E = effort and S = Stroud Index (sliding index).
The normal index is 18 for "average" programmers. The index number for beginning programmers is 5-10 and up to 40+ for highly efficient programmers.
The ideal value of T^ is between 1.5 and 2.5 and the standard value is 4.5.
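The effect of the Stroud Index on the estimate can be seen with a brief sketch. The function name is hypothetical, and the effort value 1159 is simply the one obtained for the small while-loop example in the token metrics section, so the resulting times are tiny compared with the benchmarks for real functions.

```python
def estimated_development_hours(effort, stroud_index=18):
    """T^ = E / S / 60 / 60: effort in discriminations, result in hours."""
    return effort / stroud_index / 60 / 60


EFFORT = 1159  # effort of the while-loop example

# Beginner, average and highly efficient programmer indices from the text.
for index in (5, 18, 40):
    hours = estimated_development_hours(EFFORT, index)
    print(f"S = {index:2d}: T^ = {hours:.4f} hours")
```

Dividing by 60 twice converts the seconds produced by E / S into hours.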
3.4 MAINTAINABILITY AND READABILITY MEASURES
3.4.1 Lines of code:
Should not exceed 62 lines of code per function
3.4.2 Number of executable statements:
Should not exceed 50 executable statements per function.
3.4.3 Number of comment lines:
Should be at least 60 % per function.
3.4.4 Span of reference for variables:
Average maximum number of lines between references to each variable
assignment and use in a function. It should not be greater than 10-12 lines
for various security reasons.
Object oriented analysis and design are popular concepts in today's software development environment. They are often heralded as the silver bullet for solving software problems. While in reality there is no silver bullet, object orientation has proved its value for systems that must be maintained and modified. Object oriented software development requires a different approach from more traditional functional decomposition and data flow development methods. While the functional and data analysis approaches commence by considering the system's behavior and/or data separately, object oriented analysis approaches the problem by looking for system entities that combine them. Object oriented analysis and design focuses on objects as the primary agents involved in a computation; each class of data and its related operations are collected into a single system entity.
4.1 METRICS FOR OBJECT ORIENTED SYSTEMS
Many different metrics have been proposed for object oriented systems.
The object oriented metrics that were chosen measure principal structures
that, if they are improperly designed, negatively affect the design and
code quality attributes.
The selected object oriented metrics are primarily applied to the concepts
of classes, coupling, and inheritance.
4.1.1 Class
A class is a template from which objects can be created. This set of
objects share a common structure and a common behavior manifested by the
set of methods. Three class metrics described here measure the complexity
of a class using the class's methods, messages and cohesion.
4.1.1.1 Method
A method is an operation upon an object.
Weighted Methods per Class (WMC) :
The WMC is a count of the methods implemented within a class or the
sum of the complexities of the methods (method complexity is measured by
cyclomatic complexity). The second measurement is difficult to implement
since not all methods are accessible within the class hierarchy due to
inheritance. The number of methods and the complexity of the methods involved
is a predictor of how much time and effort is required to develop and maintain
the class. The larger the number of methods in a class, the greater the
potential impact on children since children will inherit all the methods
defined in a class. Classes with large numbers of methods are likely to
be more application specific, limiting the possibility of reuse. This metric
measures usability and reusability.
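Both readings of WMC described above can be sketched in a few lines. The example is illustrative only: the function name is hypothetical, and the per-method cyclomatic complexity values for a Stack class are invented for the illustration.

```python
def weighted_methods_per_class(method_complexities, weighted=True):
    """WMC as the sum of method complexities, or as a plain method count."""
    if weighted:
        return sum(method_complexities.values())
    return len(method_complexities)


# Hypothetical cyclomatic complexity of each method of a Stack class.
STACK_COMPLEXITIES = {"push": 2, "pop": 2, "depth": 1, "val": 1}

print(weighted_methods_per_class(STACK_COMPLEXITIES))                  # weighted sum
print(weighted_methods_per_class(STACK_COMPLEXITIES, weighted=False))  # method count
```

Giving every method a weight of 1 makes the two variants coincide.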
4.1.1.2 Message
A message is a request that an object makes of another object to perform
an operation. The operation executed as a result of receiving a message
is called a method. The next metric looks at methods and messages within
a class.
Response for a Class (RFC) :
The RFC is the cardinality of the set of all methods that can be invoked
in response to a message to an object of the class or by some method in
the class. This includes all methods accessible within the class hierarchy.
This metric looks at the combination of the complexity of a class through
the number of methods and the amount of communication with other classes.
The larger the number of methods that can be invoked from a class through
messages, the greater the complexity of the class. If a large number of
methods can be invoked in response to a message, the testing and debugging
of the class becomes complicated since it requires a greater level of understanding
on the part of the tester. A worst case value for possible responses will
assist in the appropriate allocation of testing time. This metric evaluates
system design as well as the usability and the testability.
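A minimal sketch of the RFC computation follows, assuming the call relationships have already been extracted from the code. The function name, the Stack methods and the Logger.warn call are all hypothetical, and only direct calls are considered.

```python
def response_for_class(own_methods, calls):
    """Size of the response set: own methods plus methods they call directly."""
    response_set = set(own_methods)
    for method in own_methods:
        response_set.update(calls.get(method, ()))
    return len(response_set)


# Hypothetical class with four methods and their direct calls.
OWN_METHODS = {"Stack.push", "Stack.pop", "Stack.depth", "Stack.val"}
CALLS = {
    "Stack.push": {"Stack.depth", "Logger.warn"},
    "Stack.pop": {"Stack.depth", "Logger.warn"},
    "Stack.val": {"Stack.depth"},
}

print(response_for_class(OWN_METHODS, CALLS))
```

Because the response set is a set, a method called from several places is counted once.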
4.1.1.3 Cohesion
Cohesion is the degree to which methods within a class are related
to one another and work together to provide well-bounded behavior. Effective
object-oriented designs maximize cohesion since it promotes encapsulation.
The third class metric investigates cohesion.
Lack of Cohesion of Methods (LCOM):
LCOM measures the degree of similarity of methods by instance variable
or attributes. Any measure of separateness of methods helps identify flaws
in the design of classes. There are at least two different ways of measuring
cohesion:
1. Calculate for each data field in a class what percentage of the
methods use that data field. Average the percentages then subtract from
100%. Lower percentages mean greater cohesion of data and methods in the
class.
2. Methods are more similar if they operate on the same attributes.
Count the number of disjoint sets produced from the intersection of the
sets of attributes used by the methods.
High cohesion indicates good class subdivision. Lack of cohesion or
low cohesion increases complexity, thereby increasing the likelihood of
errors during the development process. Classes with low cohesion could
probably be subdivided into two or more subclasses with increased cohesion.
This metric evaluates the design implementation as well as reusability.
Example:
1. Class example.
Class Stack
Attributes
stack, stack-index (si)
Functions
push - uses/def stack, si
pop - uses/def si
depth - uses si
val - uses stack, si
2. Graph representation.
Nodes-left: set of attributes
Nodes-right: set of pairs of functions
Arcs - a pair of functions is connected to an attribute if both functions
access that attribute
3. Lack of Cohesion of Methods
P is the set of pairs of functions that share no attribute and Q is the set
of pairs that share at least one. Every pair here shares si, so:
P = { };
Q = {push-pop, push-depth, push-val, pop-depth, pop-val, val-depth}
LCOM = max( |P| - |Q|, 0 ) = max(0 - 6, 0) = 0;
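The Stack example can be checked with a short sketch that implements the pair-based formula and, for comparison, the percentage method listed as the first way of measuring cohesion. The function names are hypothetical; the attribute usage table is the one given in the example.

```python
from itertools import combinations

# Attributes used by each function of the Stack class in the example.
STACK_METHODS = {
    "push": {"stack", "si"},
    "pop": {"si"},
    "depth": {"si"},
    "val": {"stack", "si"},
}


def lcom_pairs(methods):
    """LCOM = max(|P| - |Q|, 0) over all pairs of methods."""
    p = q = 0
    for a, b in combinations(methods, 2):
        if methods[a] & methods[b]:
            q += 1      # the pair shares at least one attribute
        else:
            p += 1      # the pair shares no attribute
    return max(p - q, 0)


def lcom_percentage(methods):
    """100% minus the average percentage of methods using each attribute."""
    attributes = set().union(*methods.values())
    usage = [
        sum(attr in used for used in methods.values()) / len(methods)
        for attr in attributes
    ]
    return 100.0 * (1 - sum(usage) / len(usage))


print(lcom_pairs(STACK_METHODS))
print(lcom_percentage(STACK_METHODS))
```

The two measures use different scales, so their values are not directly comparable; both are low for a cohesive class such as this one.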
4.1.2 Coupling
Coupling is a measure of the strength of association established by
a connection from one entity to another. Classes (objects) are coupled
in three ways:
1. When a message is passed between objects, the objects are said to
be coupled.
2. Classes are coupled when methods declared in one class use methods
or attributes of the other classes.
3. Inheritance introduces significant tight coupling between superclasses
and their subclasses. (Since good object oriented design requires a balance
between coupling and inheritance, coupling measures focus on non-inheritance
coupling.)
Coupling Between Object Classes (CBO):
CBO is a count of the number of other classes to which a class is coupled.
It is measured by counting the number of distinct non-inheritance related
class hierarchies on which a class depends. Excessive coupling is detrimental
to modular design and prevents reuse. The more independent a class is,
the easier it is to reuse in another application. The larger the number of
couples, the higher the sensitivity to changes in other parts of the design
and therefore maintenance is more difficult. Strong coupling complicates
a system since a module is harder to understand, change or correct by itself
if it is interrelated with other modules. Complexity can be reduced by
designing systems with the weakest possible coupling between modules. This
improves modularity and promotes encapsulation. CBO evaluates design implementation
and reusability.
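Counting CBO from a dependency map might look like the sketch below. It is an assumption-laden illustration: the class names, the dependency data and the function name are hypothetical, and the map is assumed to have been produced by some earlier analysis of the code.

```python
def coupling_between_objects(cls, uses, inheritance_related):
    """Count distinct classes that cls depends on, excluding itself and
    classes related to it by inheritance."""
    excluded = inheritance_related.get(cls, set()) | {cls}
    return len(uses.get(cls, set()) - excluded)


# Hypothetical dependency data: class -> classes whose methods or attributes it uses.
USES = {
    "Order": {"Customer", "Product", "Logger", "BaseEntity", "Order"},
    "Customer": {"Logger"},
}
# Hypothetical inheritance relations: Order is derived from BaseEntity.
INHERITANCE_RELATED = {"Order": {"BaseEntity"}}

for name in ("Order", "Customer"):
    print(name, coupling_between_objects(name, USES, INHERITANCE_RELATED))
```

Keeping the inheritance relation in a separate map makes the exclusion of inheritance coupling explicit.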
4.1.3 Inheritance
Another design abstraction in object oriented systems is the use of
inheritance. Inheritance is a type of relationship among classes that enables
programmers to reuse previously defined objects including variables and
operators. Inheritance decreases complexity by reducing the number of operations
and operators, but this abstraction of objects can make maintenance and
design difficult. The two metrics used to measure the amount of inheritance
are the depth and breadth of the inheritance hierarchy.
Depth of Inheritance Tree (DIT):
The depth of a class within the inheritance hierarchy is the maximum
length from the class node to the root of the tree and is measured by the
number of ancestor classes. The deeper a class is within the hierarchy,
the greater the number of methods it is likely to inherit, making it more complex
to predict its behavior. Deeper trees constitute greater design complexity,
since more methods and classes are involved, but also offer greater potential
for reuse of inherited methods. A support metric for DIT is the number
of methods inherited (NMI). This metric primarily evaluates reuse but also
relates to understandability and testability.
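For a language that exposes its class hierarchy at run time, DIT can be measured directly. The sketch below uses Python's __bases__ attribute; the Shape hierarchy and the function name are hypothetical, and ignoring Python's implicit object base class (so that a user-defined root has depth 0) is an assumption made for this illustration.

```python
class Shape:
    pass


class Polygon(Shape):
    pass


class Triangle(Polygon):
    pass


def depth_of_inheritance(cls):
    """Maximum number of ancestor classes between cls and a root class.

    Python's built-in object base is ignored, so root classes have depth 0.
    """
    parents = [base for base in cls.__bases__ if base is not object]
    if not parents:
        return 0
    return 1 + max(depth_of_inheritance(parent) for parent in parents)


for cls in (Shape, Polygon, Triangle):
    print(cls.__name__, depth_of_inheritance(cls))
```

Taking the maximum over all parents matches the "maximum length" wording above when multiple inheritance is present.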
Number of Children (NOC):
The number of children is the number of immediate subclasses subordinate
to a class in the hierarchy. It is an indicator of the potential influence
a class can have on the design and on the system. The greater the number
of children, the greater the likelihood of improper abstraction of the
parent, which may be a case of misuse of subclassing. But the greater the
number of children, the greater the reuse since inheritance is a form of
reuse. If a class has a large number of children, it may require more testing
of the methods of that class, thus increasing the testing time. NOC, therefore,
primarily evaluates testability and design.
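NOC is equally simple to obtain where the run-time system tracks subclasses. The sketch below relies on Python's built-in __subclasses__() method, which returns the immediate subclasses of a class; the Shape hierarchy and the function name are hypothetical.

```python
class Shape:
    pass


class Polygon(Shape):
    pass


class Circle(Shape):
    pass


class Triangle(Polygon):
    pass


def number_of_children(cls):
    """Number of immediate subclasses of cls."""
    return len(cls.__subclasses__())


for cls in (Shape, Polygon, Triangle):
    print(cls.__name__, number_of_children(cls))
```

Triangle is a grandchild of Shape, so it is not counted among Shape's children.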
Although a significant amount of research has been carried out in this field, Software Metrics are well behind the state of the art due to various factors. Among them we can distinguish some, such as:
For a tutorial
on how to calculate McCabe's Complexity follow this link. For a simple
tool that calculates some of the metrics covered here for Java code visit
Reliable Software Technologies'
TotalMetric Web Page. To evaluate the results obtained by TotalMetric on
your Java code you can use the following table, which summarizes the measures
and benchmarks adopted by the NSA's Software Engineering Applied Technology
Center.
| Measure | Standard | Ideal | Notes |
| Purity Ratio | 1.0 | >1.25 | Automated GUI and DB procedure call range: .5-1+ |
| Volume | 3,200 | <1,000 | GUI: <3,200-10,000 DB: <3,200-7,500 |
| Effort | 300,000 | <=100,000 | GUI: <500,000-5,000,000 DB: <300,000-1,000,000 |
| Cyclomatic complexity | 10 | <10; 6-7 for C++ | 2-10 GUI, <10-25 DB |
| Functional density | 62 | 36 | |
| Executable path analysis per function | <100 | 20-50 | |
| Average number of logical branch links per path | <50 | 15-25 | |
| Estimated time to develop function in hours | 4.5 | 1.5-2.5 | |
| Predicted errors / KLOC | 4 | <3 | |
| Size | 62 | | |
| Percent of comment lines per function | 60 | 60+ | |
| Executable statements per function | <50 | 15-30 | |
| Span of reference for variable | 10-12 | | |
A - Book References
Deut88 : Deutsch, Michael S. / Willis, Ronald R. "Software Quality Engineering - A Total Technical and Management Approach" , Prentice-Hall Series in Software Engineering, 1988 (KSU Stack # QA 76.76.Q35 D48 1988)
Hatz93 : Hetzel, Bill, "Making Software Measurement Work: Building an Effective Measurement Program", QED Publishing Group, 1993 (KSU Stack # QA 76.76.Q35 H48 1993)
Jone97 : Jones, Capers, "Software Quality: Analysis and Guidelines for Success", International Thomson Computer Press, 1997 (KSU Stack # QA 76.76.Q35 J675 1997)
Kitc87 : Kitchenham, B.A. / Littlewood, B.
"Measurements for Software Control and Assurance", Elsevier Applied Science,
1987 (KSU Stack # QA 76.76.Q35 C57 1988)
Information on Software Quality and a lot more
http://www.bullseye.com/webSqa.html
Software Metrics Home Page - The STSC Measurement
Page, contains links to other related sites, articles
http://stsc.hill.af.mil/Metrics/index.html
Software Quality Assurance and Metrics - Slide show.
http://www.taejon-c.ac.kr/~kolee/se_note/sld020.htm
Software Quality Assurance and Testing Tools
http://wwwlis.iei.pi.cnr.it/LIS/Overview/tools.html
Software Quality Assurance Plan - Sample Plan
http://www.nfra.nl/~seg/SQAP-1.0.html
Software Quality, metrics, etc.
http://louisa.levels.unisa.edu.au/se1/management-sqa/week7a96.htm
Software Quality Metrics - Page containing lots of links
http://av.yahoo.com/bin/query?p=software+quality+metrics&hc=0&hs=5
Software Quality Page - Many links to related
sites
http://www.tiac.net/users/pustaver/
University of South Australia: good paper on Software Quality
Assurance.
http://louisa.levels.unisa.edu.au/se1/management-sqa/week7a96.htm
Warren Harrison's Software Metrics Homepage
http://www.cs.pdx.edu/~warren/Metrics/coverpage.html
Bullseye Testing Tech.: C-Cover -- Provides instrumentation-based
test coverage analysis of C programs. (Hatz93)
http://www.bullseye.com/
Computer Associates: CA-Metrics -- Project planning and metrics repository support with comparison to industry norms and averages. (Hatz93) http://www.cai.com/
Compuware Corp.: Pathvu -- Portfolio analysis for COBOL or Assembler
programs providing static analysis of many common program measures like
numbers of IF or looping statements, nesting level, percentage of control
statements, verbs, data elements, element references, and so on. (Hatz93)
http://www.compuware.com/
Logiscope: is a range of products to improve programming quality,
maintenance and test coverage through a comprehensive source code tool
set. Its high level of integration and its pragmatic approach allow developers
to use it as a natural complement to compiling , profiling and debugging
environments as well as "Capture/Replay" test tools. (Hatz93) * Downloadable
*
http://www.verilogusa.com/log/logiscop.htm
McCabe and Associates - Vendors of a range of commercial tools
for metric and test analysis such as ACT. ACT - Analysis of Complexity
Tool to analyze source code and plot its control flow graphically. The
tool provides basic complexity measures, quantifies the number of tests
needed, and identifies test paths and conditions needed to achieve structural
code coverage.
http://www.mccabe.com/
Panorama: a version comparator for an entire C/C++ program (rather
than a file only). * Downloadable *
http://www.softwareautomation.com/
Reliable Software Technologies - Offers an introductory version
of TotalMetric for Java. Metrics currently included are Cyclomatic Complexity,
Extended C.C., Difficulty, Effort, Volume, JavaDoc, Method count. * Downloadable
after registration *
http://www.rstcorp.com
Software Quality measuring tools
http://www.ashling.com/sqa.html
Software Quality metrics for large scale systems development
http://www.macdett.com/incose/incos96b.htm
Software Research - Offers METRICS as a part of the Test Works/Advisor
tool suite to quantitatively determine source code quality.
http://www.soft.com/
Last Updated on: 11/14/99 - 10:30 am.