

ACM SIGSOFT Software Engineering Notes, Volume 36, Number 6 (November 2011). DOI: 10.1145/2047414.2047431

Analyzing Test Case Selection & Prioritization using ACO


Bharti Suri
Assistant Professor, Computer Science Department
University School of Information Technology
Guru Gobind Singh Indraprastha University
Dwarka, Delhi-110075, India
[email protected]

Shweta Singhal
Assistant Professor, Information Technology
Jagan Institute of Management Studies
3, Institutional Area, Sector-5, Rohini, Delhi-110085, India
Telephone: +91 11 45184019
[email protected]

ABSTRACT
Regression testing is primarily a maintenance activity that is performed frequently to ensure the validity of modified software. In such cases, due to time and cost constraints, the entire test suite cannot be run. It therefore becomes essential to select or prioritize the tests in order to cover the maximum number of faults in the minimum time. Recently, Ant Colony Optimization (ACO) has been used as a new way to solve the time-constrained prioritization problem. This paper presents an analysis of the regression test prioritization technique that reorders test suites in a time-constrained environment, along with sample runs on various programs. Our analysis concluded that ACO finds better orderings at higher values of the time constraint (TC). The correctness of the technique has also been recorded to be near optimal on average.

Categories and Subject Descriptors
D.2.5 [Software Engineering]: Testing and Debugging – testing tools, regression testing.

General Terms
Experimentation, Verification.

Keywords
Regression Testing, Test Case Selection, Test Case Prioritization, Ant Colony Optimization, Analysis.

1. INTRODUCTION
Regression test case prioritization optimizes the ordering of the test cases to be executed so as to meet criteria such as maximum code coverage or a high rate of fault detection. In [7], we implemented the prioritization and selection of test cases according to the ACO technique [6], using a fault-coverage-based method together with the execution time of the test cases. In this paper we extend our work to various programs to validate the effectiveness of the implemented technique.

The maintenance phase of a software product has to go through the inevitable regression testing process. The existing test suite must be re-run whenever any addition, deletion or modification is made to the software. Re-running the test cases from the existing test suite to build confidence in the correctness of the modified software is referred to as regression testing. Quite often software developers encounter time and budget constraints, which make it almost impossible to rerun all the test cases within the specified limits. Thus we use test case minimization, selection and prioritization techniques for regression testing.

Regression test selection is the process of reducing the test suite by selecting a subset of the original test suite. Although this is a very cost-effective method for regression testing, it can leave important test cases out of the selected subset. Regression test prioritization means scheduling the test cases in an increasing or decreasing order to meet some performance goal [5]. Various prioritization criteria may be applied to the regression test suite: test cases can be prioritized randomly, optimally, by total or additional statement coverage, by total or additional branch coverage, by failure rates, or by the total fault exposing potential (FEP) [5] of the test cases.

Regression test prioritization is often performed in a time-constrained environment where testing is done for a fixed period of time. Walcott et al. [8] gave one such technique for time-aware test case prioritization in 2006. Time-aware prioritization intelligently schedules the test suite in terms of both execution time and potential fault detection information; Walcott et al. used genetic algorithms to solve this problem. In 2010, Singh et al. [6] proposed a time-constrained prioritization technique using Ant Colony Optimization (ACO). The results reported in [6] provided the motivation for implementing the algorithm and automating the technique. The algorithm was implemented and applied to two programs in [7], where a tool called ACO_TCSP was developed for this purpose. The outcomes of its execution were near optimal and motivated us to test the tool on various larger examples to confirm the generality of its results. In the present paper, the same tool, ACO_TCSP, was tested on eight sample programs and seven different time constraints. The experiments yielded better results for ACO at higher TC values, with minimal effect from the extra time taken by the algorithm. The analysis also validates that the ACO-based selection and prioritization technique, driven by fault coverage and execution time, provides near-optimum results.

2. ANT COLONY OPTIMIZATION
It is an astonishing discovery of entomologists that the real power of ants resides in their colony brain [2]. Ants are blind and communicate within the colony by means of a chemical substance called pheromone. As the ants move from their nest to a food source and back, they deposit a pheromone trail on the path. This trail is then smelled by other ants, which tend to follow the path with the strongest pheromone. The process continues until the ants have deposited a sufficient amount of food in their nest. The point of interest is that an ant reaches the food source and then returns to its nest: the ant on the shortest path is also the earliest to return, depositing more pheromone on that path (both while going and while returning). This increases the probability of other ants following the same path [1]. The ants following it lay still more pheromone, so even more ants tend to follow it, and by continuing this process all the ants ultimately converge on the shortest, or best, path.

Derived from this real behavior of ants, Dorigo proposed Ant Colony Optimization (ACO) in the 1990s [2]. The technique has already been used to solve various combinatorial optimization problems such as the knapsack problem, the travelling salesman problem, distributed networks, data mining, telecommunication networks, vehicle routing and test data generation [2, 3, 4, 9]. As proposed by Singh et al. [6], ACO can also be applied to time-constrained test suite selection and prioritization.
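To make this mechanism concrete, the following minimal sketch (not taken from the paper; it follows the basic Ant System scheme of [2], and the names choose_next and evaporate are hypothetical) shows pheromone-proportional choice of the next node and pheromone evaporation after an iteration:

    import random

    # pheromone[(i, j)] holds the trail strength on the edge between nodes i and j;
    # every edge is assumed to start with some small positive value.
    def choose_next(current, unvisited, pheromone):
        """Pick the next node with probability proportional to the pheromone on each edge."""
        weights = [pheromone[(current, node)] for node in unvisited]
        return random.choices(unvisited, weights=weights, k=1)[0]

    def evaporate(pheromone, rate=0.10):
        """Weaken every trail by the evaporation rate once an iteration completes."""
        for edge in pheromone:
            pheromone[edge] *= (1.0 - rate)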

3. DESCRIPTION OF THE TECHNIQUE
The previously proposed algorithm and its implementation are described briefly in this section; for the detailed algorithm and implementation, refer to [6, 7]. ACO_TCSP is the tool developed in [7]: it takes as input the test cases, their fault coverage and their execution time information, and produces the selected and prioritized ordering of the test suite.

Assume 'T' = {t1, t2, ..., tn} is the set of all 'n' test cases, and 'F' = {f1, f2, ..., fx} is the set of all faults seeded in the program under test. Some or all of the faults in 'F' are covered by the test cases {t1, t2, ..., tn} in the original test suite. We assume a fully connected undirected graph with the test cases as its nodes. 'zi' is the time taken while an ant covers new nodes in the graph. For evaluating a complete path, the total time constraint, TC, is taken to be MAX (a user-defined input). The same number of artificial ants is generated as the number of test cases. 'S', which consists of 'm' test cases (m < n, S ⊆ T), is a subset of the original test suite and contains the test cases selected by each ant; there are 'n' such subsets corresponding to the 'n' ants. Deposited pheromone is represented by 'wi', the weight of the ith edge in the graph. The pheromone deposition rate is taken to be +1 for each edge crossed by an ant on the optimal path, and the pheromone evaporation rate is set to 10%, which is deducted from the weight of each edge after an iteration is completed [2].
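A minimal sketch of one iteration in the spirit of this description is given below. It is not the ACO_TCSP implementation: the exact selection rule, tie handling and stopping details of [6, 7] are simplified, the helpers choose_next and evaporate are the hypothetical ones sketched at the end of Section 2, and fault_cov/exec_time are assumed to be lookup tables built from the seeded-fault and timing data.

    import random

    def run_iteration(tests, all_faults, exec_time, fault_cov, pheromone):
        """One simplified iteration: each ant extends a path of test cases until every
        seeded fault is covered (or the tests run out); the iteration's best path
        (full coverage in the least execution time) gets a +1 pheromone deposit on
        its edges, and all edges then evaporate by 10%."""
        best_path, best_time = None, float("inf")
        for _ in range(len(tests)):                       # as many ants as test cases
            start = random.choice(tests)
            path, covered, total = [start], set(fault_cov[start]), exec_time[start]
            while covered != all_faults and len(path) < len(tests):
                nxt = choose_next(path[-1], [t for t in tests if t not in path], pheromone)
                path.append(nxt)
                covered |= fault_cov[nxt]
                total += exec_time[nxt]
            if covered == all_faults and total < best_time:
                best_path, best_time = path, total
        if best_path is not None:
            for a, b in zip(best_path, best_path[1:]):    # +1 deposit on the best path
                pheromone[(a, b)] += 1
                pheromone[(b, a)] += 1                    # undirected graph
        evaporate(pheromone)                              # 10% evaporation per iteration
        return best_path, best_time

Repeating such iterations until the time budget TC is exhausted, and returning the best path seen, corresponds to producing the selected and prioritized test suite.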
4. EXPERIMENTAL DESIGN

4.1 Programs
For our analysis we used seven C++ programs and one Java program. Using a fault seeding technique, five to ten modified versions and black-box test cases were generated for each program. A brief description of the programs, their sizes, versions and test suites is given in Table 1.

Table 1. Details of the Selected 8 Test Programs

    Prog. No.  Program Name   Language  LOC  No. of Versions  No. of Test Cases  Total Test Suite Execution Time (sec)
    P1         CollAdmission  C++       281  5                9                  105.32
    P2         HotelMgmnt     C++       666  5                5                  49.84
    P3         triangle       C++       37   6                19                 382
    P4         quadratic      C++       38   8                19                 441
    P5         cost_of_pub    C++       31   8                19                 382
    P6         calculator     C++       101  9                25                 82.5
    P7         prev_day       C++       87   7                19                 468
    P8         railway_book   Java      129  10               26                 177

4.2 Variables
The independent variables manipulated in the experiment are:
1. Subject programs with five to ten faults each.
2. Various time constraints (TC).
For each program, on each run we measured:
1. The ratio of the total execution time of the test suite to the reduced ACO execution time.
2. Whether or not the optimal path was found.
For each program run, the random nature of ACO led to different paths, from which three dependent variables were computed:
1. Percentage reduction in test suite size,
2. Percentage reduction in execution time, and
3. Percentage of the number of times the best path is found.

4.3 Design
To build confidence in the ACO technique, we repeated ten runs for each program with the same test suites for seven different TC values. Each run yielded the path found, whether or not it was optimal. The time constraint (TC), chosen as the stopping criterion of the algorithm, was varied from 85 to 400 over the 70 runs of each program. Time reduction, optimality of the technique and correctness of the technique (matching the optimum path) were obtained from the output data.
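The full grid described here amounts to 8 programs x 7 TC values x 10 repetitions = 560 tool runs. A driver of roughly the following shape (illustrative only; run_aco_tcsp is a hypothetical stand-in for invoking the tool, which is not reproduced here) suffices to collect the measurements listed above:

    TC_VALUES = [85, 150, 200, 250, 300, 350, 400]    # the seven time constraints (seconds)
    RUNS_PER_CASE = 10

    def collect_results(programs, run_aco_tcsp):
        """Run the tool for every program/TC combination and record each outcome."""
        results = []
        for prog in programs:                          # the eight subject programs
            for tc in TC_VALUES:
                for run in range(RUNS_PER_CASE):
                    ordering, suite_time, found_optimum = run_aco_tcsp(prog, tc)
                    results.append((prog, tc, run, suite_time, found_optimum))
        return results                                 # 8 * 7 * 10 = 560 records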

5. DATA ANALYSIS
For our experiment, out of the complete test pool, we randomly chose test suites of five to twenty-six test cases for P1 to P8 respectively. The execution time of each test case was then calculated and used as an input to the ACO algorithm. The algorithm was executed 70 times for each test program, with the time constraint varied over TC = 85, 150, 200, 250, 300, 350 and 400. The results obtained are summarized in Fig. 2 - Fig. 7.

5.1 Time Analysis
Previously, ACO_TCSP [7] was tested only for a constant value of TC on various programs. In this paper, a time analysis for seven different values of TC on all eight programs is presented. Ten runs were recorded for each case, making a total of 560 ACO runs (10 * 7 * 8) on the test data. For 381 of the 560 runs, the optimum path was found using ACO_TCSP. This indicates a very high probability of obtaining optimal results with the ACO_TCSP tool.

[Fig 2: Number of Best Paths found out of 10 versus TC]

The number of best paths found by each test program for varying values of TC, out of ten runs each, was recorded and is represented graphically in Fig. 2. It can be clearly observed that the number of best paths found increases with the value of TC for all programs except P3 and P5. Although the eight programs find different numbers of best paths, a general increasing trend with rising TC can be inferred. P3 and P5 find ten best paths out of ten in all cases, so no increasing trend with TC is possible for them. This also shows the effectiveness of ACO_TCSP in finding optimum paths according to the total fault coverage and minimum execution time criteria.


[Fig 3: Average Execution Time of ACO Paths versus TC]

The average execution times of the final paths found over the ten runs of ACO_TCSP, for all eight test programs and seven different TC values, are depicted graphically in Fig. 3. Even though the execution times of the final paths differ across the eight programs, the general trend is a decrease in execution time as the TC value increases. Thus, as TC is increased, ACO_TCSP finds paths with lower execution time, from which it can be concluded that a larger TC provides better results. The exceptions for P3 and P5 are the same as explained before.

[Fig 4: Average No. of Iterations taken by ACO_TCSP versus TC]

The average number of iterations, over ten runs each, taken by the ACO_TCSP tool for varying input TC values was computed, and the comparison with the TC values is shown in Fig. 4. It can be readily observed that the number of iterations taken by ACO_TCSP increases linearly with the input TC value. This is expected, as TC is the criterion chosen for stopping a single run of ACO_TCSP. Thus, the greater the value of TC, the more iterations ACO_TCSP takes to execute on the given test suite and generate the selected and prioritized test suite as its output.

[Fig 5: Average No. of Iterations at which the Optimum Path is found versus TC]

The average number of iterations of an ACO_TCSP execution, out of ten runs, at which the optimum path is found was also noted. Its variation with different TC values is depicted in Fig. 5. A general increase, with rising TC, in the iteration at which the optimum path is found can be observed. P3 and P5 again show exceptional behavior, as the optimum path for them is found in the first iteration in all 70 test runs. Thus, increasing the value of TC increases the chances of the proposed technique finding the optimum path.

[Fig 6: Percentage Reduction in Execution Time for ACO-selected test cases versus TC]

As already analyzed in the previous section, the technique provides a near-optimum reduction in the execution time of the final selected and prioritized test suite. The same has been plotted against TC in Fig. 6. The chosen TC values range from 85 to 400 seconds. Again, we can clearly observe that a higher percentage time reduction is achieved for higher TC. P3 and P5 show constant values because the optimum time reduction is achieved in all their test runs. Thus, better results are obtained for larger TC values.

5.2 Cost Benefit Analysis
The results obtained in the previous sections were further summarized to provide a cost-benefit analysis of the technique. The cost of running ACO_TCSP, in addition to the cost of running the selected and prioritized test suite, is compared with the cost of executing the complete regression test suite.


[Fig 7: Percentage Reduction in Test Suite using ACO for the test programs]

The percentage reduction in the size of the selected test suite using ACO_TCSP, averaged over the 70 runs for all eight programs, was accrued and is shown pictorially in Fig. 7. The percentage of the test suite selected has been calculated using the formula:

    % of test cases selected = (no. of selected tests * 100) / total no. of test cases

If 'n' is the size of the test suite and 's' is the size of the test suite selected using ACO, then the percentage reduction in test suite size is computed by the formula:

    % reduction in test suite size = (n - s) / n * 100 %

ACO provides a very high (near-optimum) reduction in the size of the test suite for all the test programs, without exception.

Similarly, we can calculate the percentage reduction in the execution time of the selected test suite using ACO, compared with the complete regression test suite. Let 'T' be the total execution time of the original test suite, 'Ta' the running time of ACO_TCSP, and 'Ts' the execution time of the selected test suite. Then the percentage reduction in execution time using ACO is formulated as:

    % execution time reduction = (T - (Ta + Ts)) / T * 100 %
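Transcribed directly into code, the three reduction measures above look as follows (a hypothetical helper, not part of ACO_TCSP; the example numbers are illustrative, not values reported in the paper):

    def suite_metrics(n, s, T, Ta, Ts):
        """Reduction metrics as defined above.
        n: tests in the original suite, s: tests selected by ACO,
        T: execution time of the original suite, Ta: running time of ACO_TCSP,
        Ts: execution time of the selected suite."""
        pct_selected = s * 100.0 / n                        # % of test cases selected
        pct_suite_reduction = (n - s) * 100.0 / n           # % reduction in test suite size
        pct_time_reduction = (T - (Ta + Ts)) * 100.0 / T    # % reduction in execution time
        return pct_selected, pct_suite_reduction, pct_time_reduction

    # Example with P1 from Table 1 (9 tests, 105.32 s), assuming 6 tests are selected:
    print(suite_metrics(n=9, s=6, T=105.32, Ta=2.0, Ts=60.0))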
The same formula is used to compute the average execution time reduction over 70 runs for each test program, as depicted in Fig. 8. From the figure, it can be inferred that the technique proves to be very effective in terms of time reduction as well.

[Fig 8: Percentage Execution Time Reduction using ACO]

5.3 Correctness of the Technique
Another important aspect of the analysis is the correctness of the technique, i.e. how often the technique is able to give the optimum result (the best possible ordering). We computed, for all eight programs, the percentage of the 70 runs in which the result was an optimum ordering. The percentage correctness values of ACO_TCSP are presented graphically in Fig. 9.

[Fig 9: Average Percentage Correctness using ACO]

On examining Fig. 9, it can be observed that for P3 and P5, ACO_TCSP provides 100% correctness. For P1, P2, P4, P6 and P7, a considerable degree of correctness is achieved. For P8, however, only 18.57% correctness is achieved, which is not an acceptable value. This might be because even the highest chosen TC (400 sec) was not sufficient for this particular program, and better results might be achieved with a larger number of iterations in a single ACO_TCSP run.

6. CONCLUSION
In this paper, we validated the technique proposed in [6] and implemented in [7]. The results achieved are encouraging in that: i) the test suite selection and prioritization technique reduces the size of the test suite, ii) the execution time is considerably reduced, iii) the correctness achieved is very high for most of the test programs, and iv) the ordered test suite potentially enables faults to be discovered earlier.

The time analysis for the varying values of TC led to five major observations:
1) More optimum (or best) paths are found at higher values of TC.
2) Lower execution time for the selected path is achieved at higher TC values.
3) The number of iterations taken by ACO_TCSP is directly proportional to TC; the two can be said to have a linear relationship.
4) ACO_TCSP is more likely to produce optimum paths at higher values of TC.
5) A larger time reduction for the ACO-selected test suite is achieved at higher values of TC.


All these observations imply that the ACO technique for prioritization and selection gives better results at higher TC values, with minimal effect from the extra time taken by the algorithm.

We have analyzed a selection and prioritization approach based on ACO that seeks the solution nearest to the optimal one. The experimentation and analysis on the examples considered in this paper gave encouraging results. It can now be summarized that the higher the value of TC, the greater the number of iterations and hence the more correct the final outcome. ACO is a strong and robust heuristic approach involving positive feedback and parallel computation, and it therefore leads to the best or near-best solutions in optimum time. Issues for future research include improving the technique so that all programs give correct results; we also aim to compare it with other bio-inspired algorithms.

7. REFERENCES
[1] Caro, G. Di and Dorigo, M. 1998. AntNet: Distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, 9, (1998), 317-365.
[2] Dorigo, M., Maniezzo, V., and Colorni, A. 1996. Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man and Cybernetics, Part B, 26, (1996), 29-41.
[3] Li, H., and Peng Lam, C. 2005. Software Test Data Generation Using Ant Colony Optimization. In Transactions on Engineering, Computing and Technology (2005).
[4] Parpinelli, R.S., Lopes, H.S., and Freitas, A.A. 2002. Data mining with an ant colony optimization algorithm. IEEE Transactions on Evolutionary Computation, 6, (2002), 321-332.
[5] Rothermel, G., Untch, R.H., Chu, C., and Harrold, M.J. 1999. Test case prioritization: An empirical study. In Proceedings of the International Conference on Software Maintenance, (1999), 179-188.
[6] Singh, Y., Kaur, A., and Suri, B. 2010. Test Case Prioritization Using Ant Colony Optimization. ACM SIGSOFT Software Engineering Notes, New York, USA, (July 2010), 1-7.
[7] Suri, B., and Singhal, S. 2011. Implementing Ant Colony Optimization for Test Case Selection and Prioritization. International Journal on Computer Science and Engineering, 3(5), (May 2011), 1924-1932.
[8] Walcott, K.R., Soffa, M.L., Kapfhammer, G.M., and Roos, R.S. 2006. Time aware test suite prioritization. In Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), Portland, Maine, USA, (2006), 1-11.
[9] Zhao, P., Zhao, P., and Zhang, X. 2006. New Ant Colony Optimization for the Knapsack Problem. (2006).
