PROFORMA – E
Research Proposals
SOFTWARE RELIABILITY GROWTH MODEL WITH TESTING EFFORT
1. INTRODUCTION:
A computer system consists of two major components: hardware and software.
Although extensive research has been done in the area of hardware reliability, the software
reliability of computer systems has also been studied since 1970. Software reliability is the
probability that a given piece of software will function without failure in a given environment
during a specified period of time. Hence, software reliability is a key factor in the software
development process and in software quality. The testing phase is an important and expensive
part of the software development process, which consists of four phases: specification, design,
programming, and test-and-debug. A software development project consumes many resources. Most
papers assume that the consumption rate of testing-resource expenditures during the testing
phase is constant, or do not consider testing effort at all. In reality, software reliability
models should be developed by incorporating different testing-effort functions. Yamada and Musa
proposed a new and simple software reliability growth model which describes the relationship
among calendar testing time, the amount of testing effort, and the number of software errors
detected. Testing effort can be measured by the number of CPU hours, the number of executed
test cases, and so on.
Most SRGMs that incorporate the effect of testing effort on software reliability growth
assume that the software development effort can be described by a traditional Rayleigh,
Weibull, or Exponential curve. However, in many software testing environments it is difficult
to describe the testing-effort function by these three consumption curves. In this work we
will therefore show that a logistic testing-effort function can be used as a software
development/test-effort curve. Experiments have been performed on three real test/debug data
sets. The results show that SRGMs with a logistic testing-effort function estimate the number
of initial faults better than previous approaches.
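For reference, a commonly used parameterization of the logistic testing-effort function
(quoted here as an assumption; the exact form to be adopted will be fixed during the study) is

W(t) = N / (1 + A e^(-α t)),

where N is the total amount of testing effort eventually consumed and A and α are shape
parameters.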
2. NEED AND IMPORTANCE OF RESEARCH PROBLEM:
We will briefly review some testing-effort functions. The software testing phase
consumes a large amount of testing effort, such as manpower, test cases, and CPU time.
Traditionally, the testing effort during the testing phase and the time-dependent behavior of
development effort in the software development process have been described by an Exponential,
a Rayleigh, or a Weibull curve, as proposed by Yamada, Musa, Putnam, and Kapur. Let W(t) be
the cumulative amount of testing-effort expenditures in the testing time interval (0, t) and
g(t) be the consumption rate of the testing-effort expenditures. The testing effort consumed
per unit time is assumed to be proportional to the remaining amount of testing-effort
expenditures, a - W(t), where a is the total amount of testing effort eventually consumed.
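Under this assumption the testing-effort expenditure satisfies (a standard derivation,
assuming W(0) = 0):

dW(t)/dt = g(t) [a - W(t)],

which gives W(t) = a [1 - exp(-∫(0 to t) g(τ) dτ)]. For a constant consumption rate g(t) = b
this reduces to the Exponential curve W(t) = a(1 - e^(-bt)), and for g(t) = b t it yields the
Rayleigh curve W(t) = a(1 - e^(-b t²/2)).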
Assuming that software reliability can somehow be measured, a logical question is
what purpose it serves. Software reliability is a useful measure in planning and
controlling resources during the development process so that high quality software can be
developed. It is also a useful measure for giving the user confidence about software
correctness. Planning and controlling the testing resources via the software reliability
measure can be done by balancing the additional cost of testing and the corresponding
improvement in software reliability. As more and more faults are exposed by the testing
and verification process, the additional cost of exposing the remaining faults generally
rises very quickly. Thus, there is a point beyond which continuation of testing to further
improve the quality of software can be justified only if such improvement is cost
effective. An objective measure like software reliability can be used to study such a
tradeoff.
Current approaches for measuring software reliability basically parallel those used
for hardware reliability assessment with appropriate modifications to account for the
inherent differences between software and hardware. For example, hardware exhibits
mixtures of decreasing and increasing failure rates. The decreasing failure rate is seen due
to the fact that, as test or use time on the hardware system accumulates, failures, most
likely due to design errors, are encountered and their causes are fixed. The increasing
failure rate is primarily due to hardware component wearout or aging. There is no such
thing as wearout in software. It is true that software may become obsolete because of
changes in the user and computing environment, but once we modify the software to
reflect these changes, we no longer talk of the same software but of an enhanced or a
modified version. Like hardware, software exhibits a decreasing failure rate
(improvement in quality) as the usage time on the system accumulates and faults, say, due
to design and coding, are fixed. It should also be noted that an assessed value of the
software reliability measure is always relative to a given use environment. Two users
exercising two different sets of paths in the same software are likely to have different
values of software reliability.
3. OBJECTIVES:
A number of analytical models have been proposed to address the problem of
software reliability measurement. These approaches are based mainly on the failure
history of software and can be classified according to the nature of the failure process
studied, as indicated below.
4. METHODOLOGY:
Fault Count Models: This class of models is concerned with modeling the
number of failures seen or faults detected in given testing intervals. As faults are
removed from the system, it is expected that the observed number of failures per unit
time will decrease. If this is so, then the cumulative number of failures versus time curve
will eventually level off. Note that time here can be calendar time, CPU time, the number of
test cases run, or some other relevant metric. In this setup, the time intervals may be fixed
a priori and the observed number of failures in each interval is treated as a random
variable.
Several models have been suggested to describe such failure phenomena. The
basic idea behind most of these models is that of a Poisson distribution whose parameter
takes different forms for different models. It should be noted that the Poisson distribution
has been found to be an excellent model in many fields of application where the interest is
in the number of occurrences.
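For reference, the Poisson probability mass function with mean λ is

P{X = y} = (λ^y / y!) e^(-λ),   y = 0, 1, 2, ...;

in the models below this parameter is replaced by a suitable time-dependent mean value
function.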
Goel-Okumoto Nonhomogeneous Poisson Process Model:
In this model, Goel and Okumoto assumed that a software system is subject
to failures at random times caused by faults present in the system. Letting N(t) be the
cumulative number of failures observed by time t, they proposed that N(t) can be modeled
as a nonhomogeneous Poisson process, i.e., as a Poisson process with a time dependent
failure rate. Based on their study of actual failure data from many systems, they proposed
the following form of the model
P{N(t) = y} = [(m(t))^y / y!] e^(-m(t)),   y = 0, 1, 2, ...
where
m(t) = a(1 - e^(-bt)),
and
λ(t) ≡ m'(t) = a b e^(-bt).
Here m(t) is the expected number of failures observed by time t and λ(t) is the
failure rate. In this model a is the expected number of failures to be observed eventually
and b is the fault detection rate per fault. It should be noted that here the number of
faults to be detected is treated as a random variable whose observed value depends on the
test and other environmental factors. This is a fundamental departure from other models,
which treat the number of faults as a fixed unknown constant. The software reliability, i.e.,
the probability of no failure in the interval (s, s + x] given testing up to time s, is then
given by
R(x | s) = e^(-[m(s + x) - m(s)]).
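A minimal sketch of these three quantities in Python (illustrative only; the parameter values
below are placeholders, not estimates from real failure data):

```python
import numpy as np

def m(t, a, b):
    """Expected cumulative number of failures by time t (Goel-Okumoto NHPP)."""
    return a * (1.0 - np.exp(-b * t))

def failure_rate(t, a, b):
    """Failure intensity lambda(t) = m'(t)."""
    return a * b * np.exp(-b * t)

def reliability(x, s, a, b):
    """Probability of no failure in (s, s + x], given testing up to time s."""
    return np.exp(-(m(s + x, a, b) - m(s, a, b)))

# Illustrative (hypothetical) parameter values
a_hat, b_hat = 100.0, 0.1
print(m(10.0, a_hat, b_hat))                 # expected failures by t = 10
print(failure_rate(10.0, a_hat, b_hat))      # failure intensity at t = 10
print(reliability(1.0, 10.0, a_hat, b_hat))  # one-hour-ahead reliability at s = 10
```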
5. SIZE OF SAMPLES:
An Example of Software Reliability Modeling
We now employ the above procedure to illustrate the development of a software
reliability model based on failure data from a real-time, command and control system.
The delivered number of object instructions for this system was 21,700, and it was
developed by Bell Laboratories. The data were reported by Musa and represent the
failures observed during system testing over 25 hours of CPU time.
For purposes of this illustration, we employ the NHPP model of Goel and
Okumoto. We do so because of its simplicity and applicability over a wide range of
testing situations as also noted by Misra, who successfully used this model to predict the
number of remaining faults in a space shuttle software subsystem.
Step 1: The original data were reported as times between failures. To overcome the
possible lack of independence among these values, we summarized the data into numbers
of failures per hour of execution time. A plot of N(t), the cumulative number of failures
by time t, was then examined.
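A small sketch of this summarization step (the interfailure times below are hypothetical,
used only to show the conversion from times between failures to hourly counts and to N(t)):

```python
import numpy as np

# Hypothetical times between failures (in CPU seconds); the real Musa data set
# is not reproduced here.
interfailure_times = np.array([1800.0, 2400.0, 900.0, 3600.0, 1200.0, 4800.0, 600.0, 7200.0])

# Cumulative failure times in CPU hours
failure_times_hours = np.cumsum(interfailure_times) / 3600.0

# Number of failures observed in each one-hour interval of execution time
n_hours = int(np.ceil(failure_times_hours[-1]))
failures_per_hour, _ = np.histogram(failure_times_hours, bins=np.arange(0, n_hours + 1))

# Cumulative number of failures N(t) at the end of each hour
N_t = np.cumsum(failures_per_hour)
print(failures_per_hour, N_t)
```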
Step 2: A study of the data in the plot indicates that the failure rate (number of failures per
hour) seems to be decreasing with test time. This means that an NHPP with a mean value
function m(t) = a(1 - e^(-bt)) should be a reasonable model to describe the failure process.
Step 3: For the above model, two parameters, a and b, are to be estimated from the failure
data. We chose to use the method of maximum likelihood for this purpose. The estimated
values for the two parameters are â = 142.32 and b̂ = 0.1246. Recall that â is an estimate of
the expected total number of faults likely to be detected and b̂ represents the number of
faults detected per fault per hour.
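A minimal sketch of this maximum-likelihood step for grouped hourly failure counts under the
NHPP model (the hourly counts below are hypothetical, and the use of scipy's Nelder-Mead
optimizer is our own choice, not part of the original analysis):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, t, n):
    """Negative log-likelihood for grouped NHPP data with m(t) = a(1 - exp(-b t)).

    t : interval end points t_1 < ... < t_k (t_0 = 0 is implied)
    n : observed number of failures in each interval (t_{i-1}, t_i]
    The constant term sum(log(n_i!)) is omitted since it does not depend on a, b.
    """
    a, b = params
    if a <= 0 or b <= 0:
        return np.inf
    m = a * (1.0 - np.exp(-b * np.concatenate(([0.0], t))))
    dm = np.diff(m)                      # expected failures per interval
    return -(np.sum(n * np.log(dm)) - m[-1])

# Hypothetical hourly failure counts (not the Musa data set)
t = np.arange(1.0, 11.0)                 # end of hours 1..10
n = np.array([27, 16, 11, 10, 11, 7, 2, 5, 3, 1], dtype=float)

res = minimize(neg_log_likelihood, x0=[n.sum() * 1.5, 0.1],
               args=(t, n), method="Nelder-Mead")
a_hat, b_hat = res.x
print(a_hat, b_hat)
```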
Step 4: The fitted model based on the data and the parameters estimated in Step 3 is
m(t) = 142.32(1 - e^(-0.1246t)) and λ(t) = 17.73 e^(-0.1246t).
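As a quick consistency check, the fitted failure rate at t = 0 is λ(0) = â b̂ = 142.32 × 0.1246
≈ 17.73 failures per hour, which matches the coefficient in λ(t) above.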
Step 5: In this case, we used the Kolmogorov-Smirnov goodness-of-fit test for checking
the adequacy of the model. Basically, the test provides a statistical comparison between
the actual data and the model chosen in Step 2. The fitted model in Step 4 passed this test
so that it could be considered to be a good descriptor of the data. The plots also provide a
visual check of the goodness-of-fit of the model.
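One common way to carry out such a Kolmogorov-Smirnov check for an NHPP fit is sketched below;
it relies on the standard result that, conditional on the number of failures observed by time
T, the transformed failure times m(t_i)/m(T) behave like an ordered uniform sample (the
failure times used here are hypothetical):

```python
import numpy as np
from scipy.stats import kstest

def m(t, a, b):
    return a * (1.0 - np.exp(-b * t))

# Hypothetical failure times (hours) and the fitted parameters from Step 3
failure_times = np.array([0.4, 0.9, 1.6, 2.3, 3.5, 4.1, 5.8, 7.2, 9.0, 12.5])
a_hat, b_hat = 142.32, 0.1246
T = 25.0                                  # total observed test time

# Conditional on N(T), the values m(t_i)/m(T) should look like a uniform(0, 1) sample
u = m(failure_times, a_hat, b_hat) / m(T, a_hat, b_hat)
stat, p_value = kstest(u, "uniform")
print(stat, p_value)                      # large p-value -> no evidence of lack of fit
```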
Step 6: For illustration purposes, we computed only one performance measure, the
expected number of remaining faults, at various testing times. Plots of the confidence
bounds for the expected cumulative number of failures, and the expected number of
remaining faults are also shown. A study of these plots indicates that the chosen NHPP
model provides an excellent fit to the data and can be used for purposes of describing the
failure behavior as well as for prediction of the future failure process. The information
available from this can be used for planning, scheduling, and other management decisions
as indicated below.
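The expected number of remaining faults used in Step 6 follows directly from the model
definitions: it is the expected eventual total minus the expected number observed by time t,

a - m(t) = a e^(-bt),

which for the fitted model equals 142.32 e^(-0.1246 t).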
Step 7: The model developed above can be used for answering a variety of questions about
the failure process and for determining the additional test effort required before the system
is ready for release. This type of information can be sought at various points in time, and
one does not have to wait until the end of testing. For illustrative purposes, suppose that
failure data through only 16 hours of testing are available and a total of 122 failures has
been observed. Based on these data, the fitted model is m(t) = 138.37(1 - e^(-0.133t)).
An estimate of the remaining number of faults is 16.37, with a 90 percent
confidence interval of (4.64, 28.11). Also, the estimated one-hour-ahead reliability is
0.165, and the corresponding 90 percent confidence interval is (0.019, 0.310).
Now, suppose that a decision to release software for operational use is to be based
on the number of remaining faults. Specifically, suppose that we would release the system
if the expected number of remaining faults is less than or equal to 10. In the above
analysis we saw that the best estimate of this quantity at present is 16.37, which means
that we should continue testing in the hope that additional faults can be detected and
removed. If we were to carry out a similar analysis after each additional hour of testing,
the expected number of remaining faults after 20 hours would be 9.85, so that the above
release criterion would be met.
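A small sketch of this release-time calculation, holding the 16-hour estimates fixed (in
practice the parameters are re-estimated after each additional hour of data, which is why the
text quotes 9.85 rather than the value this sketch produces):

```python
import numpy as np

a_hat, b_hat = 138.37, 0.133   # estimates from the first 16 hours of testing
target = 10.0                  # release when expected remaining faults <= 10

# Expected number of remaining faults at time t under the fitted model
def remaining(t):
    return a_hat * np.exp(-b_hat * t)

for t in range(16, 26):
    print(t, round(remaining(t), 2))

# Smallest t with a_hat * exp(-b_hat * t) <= target
t_release = np.log(a_hat / target) / b_hat
print("release after about", round(t_release, 1), "hours of testing")
```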
6. HYPOTHESIS:
Based on the assumptions, the number of errors detected per unit of current testing-effort
expenditure is proportional to the number of remaining errors. The example above is meant to
illustrate the kind of information that can be obtained from a software reliability model.
In practice, determination of release time, additional testing effort, etc. is based on much
more elaborate considerations than the number of remaining faults alone. The results from
models such as the ones developed here can be used as inputs into the decision-making process.
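One standard way to formalize this hypothesis (a sketch; here r denotes the error-detection
rate per remaining error, w(t) = dW(t)/dt is the current testing-effort expenditure of
Section 2, and m(t) is the expected number of errors detected by time t) is

(dm(t)/dt) / w(t) = r [a - m(t)],   m(0) = 0,

whose solution is m(t) = a [1 - e^(-r W(t))].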