GREEDY DISCRETE ANT COLONY OPTIMIZATION FOR HIGH COVERAGE TEST SUITE GENERATION

Test suite optimization is a significant problem in software engineering research because it reduces the cost of testing a software program. A few recent research works address test suite generation and reduction. However, a new technique is still required to improve the coverage rate of test suite generation and to remove redundant test cases. To overcome these limitations, a Greedy Discrete Ant Colony Optimization (GDACO) technique is proposed. The main objective of the GDACO technique is to optimize the coverage capability of test suite generation. The GDACO technique first uses the Ant Colony Optimization (ACO) algorithm to generate test suites. The ACO algorithm selects test cases from the test case set based on the trail's probability and subsequently updates the pheromone trails until the maximum number of iterations is reached. This process generates test suites for testing software programs. The GDACO technique then uses the Greedy Discretization algorithm for test suite optimization. The Greedy Discretization algorithm chooses the test cases that cover the most test requirements and removes redundant test cases from the test suites. The GDACO technique therefore obtains a minimal cardinality subset of test suites with a higher coverage rate of faults. Experiments are conducted on parameters such as coverage rate, average rate of test suite reduction and execution time. The experimental results demonstrate that the GDACO technique improves the coverage rate of software faults and also increases the average rate of test suite reduction when compared to state-of-the-art works.


INTRODUCTION
Software testing is the process of checking the functionality of a software program through analytical methods. Test cases play a vital role in software testing for determining software quality. Test suites are therefore generated from test cases for testing software programs. Many research works have been designed for test suite generation and optimization. However, the optimization of the coverage capability of test suite generation remains insufficient.
A Memetic Algorithm was designed in [1] for whole test suite generation and optimization. However, coverage capability was not ensured. A test case minimization approach was developed in [2] to reduce the number of test cases in configuration-aware structural testing. However, this approach takes more computational time to achieve test suite minimization.
The efficiency of generated test suites was analyzed in [3] with respect to four coverage criteria, using counterexample-based test generation and a random generation approach. However, code coverage alone does not ensure test quality. A Cuckoo Search Algorithm was designed in [4] to systematically reduce the number of test cases by considering combinations of inputs.
A novel approach called whole test suite generation was developed in [5] for test data generation that covers all coverage goals and simultaneously reduces the total size of the test cases. However, optimizing the coverage capability of the test suite remained unsolved. An intelligent search-based method was proposed in [6] to generate test cases automatically for coverage-oriented software testing. This method provides better performance in terms of test coverage and the number of test cases. However, test coverage was not at the required level.
A Tabu Search hyper-heuristic strategy was presented in [7] for t-way test suite generation. However, test suite optimization remained unaddressed. A test-suite generation approach was developed in [8] for efficiently achieving complete multi-goal test coverage of product-line implementations. However, optimization criteria for the ordering of test goals were not considered.
A Parallel Genetic Algorithm based on Spark was employed in [9] for pairwise test suite generation and for reducing test suite size. However, finding a smaller test suite size remained unsolved. A novel regression test selection approach was designed in [10] based on the analysis of state and dependence models of components to generate a regression test suite. However, its execution time was high. Different techniques designed for test suite minimization were analyzed in [11] for enhancing the testing process of test suites and achieving all the testing requirements.
Based on the techniques and methods discussed above, the Greedy Discrete Ant Colony Optimization (GDACO) technique is developed. The research objectives of the GDACO technique are formulated as follows:
• To optimize the coverage capability of test suite generation, the GDACO technique is introduced.
• To generate the test suites for testing software programs, the Ant Colony Optimization (ACO) algorithm is used in the GDACO technique.
• To perform test suite reduction, the Greedy Discretization algorithm is employed in the GDACO technique.
The rest of this paper is organized as follows. Section II explains the Greedy Discrete Ant Colony Optimization (GDACO) technique with the aid of an architecture diagram. Section III and Section IV explain the experimental settings and detail the performance analysis with the aid of parameters. Section V describes the related works. Finally, Section VI concludes this paper.

A GREEDY DISCRETE ANT COLONY OPTIMIZATION TECHNIQUE
In software engineering, a test suite contains a set of test cases for testing a software program to show that it has some specific set of characteristics. As shown in Figure 1, the GDACO technique takes the schoolmate data set as input. Next, the GDACO technique applies the Ant Colony Optimization algorithm to generate test suites. Then, the GDACO technique uses the Greedy Discretization algorithm for test suite minimization, which results in a minimal cardinality subset of test suites that finds more faults in the schoolmate data set. The GDACO technique therefore increases the average rate of test suite reduction while achieving a higher coverage rate of faults. A detailed explanation of the GDACO technique is given in the following sections.

A. Test Suite Generation Using Ant Colony Optimization
The GDACO technique uses the Ant Colony Optimization (ACO) algorithm to generate a set of test suites that achieve high fault coverage in a short time. ACO is a metaheuristic that discovers solutions to complex combinatorial optimization problems. The ACO algorithm is based on the behavior of real ants, which determine the shortest path from the nest to a food source and back by depositing a chemical substance termed pheromone. Foragers sense the pheromone and follow the trail to food laid by other ants. In this way, the shortest route to the food source is identified. By using this ACO process, the GDACO technique selects test cases from the test case set in order to generate test suites for testing software programs based on user test requirements. The following diagram shows the test suite generation process using Ant Colony Optimization.

Figure 2. Test Suite Generation Process Using Ant Colony Optimization
As shown in Figure 2, the Ant Colony Optimization algorithm first takes the test case set as input and then initializes the ACO parameters. In the ACO algorithm, an ant selects the test cases for generating test suites based on the trail's probability and subsequently updates the pheromone trails. This process is repeated until the maximum number of iterations is reached. Finally, the ACO algorithm takes the best solutions found over the iterations in order to form the test suites.
For generating the test suites, the ACO algorithm first constructs a directed graph G = (V, E) in which V represents the vertices (i.e. test cases) and E denotes the edges between pairs of vertices. Each edge e ∈ E is allocated a weight that signifies the amount of pheromone that an ant may deposit on the track, with the initial weights set to 1 as shown in Figure 2. The ants traverse the graph using a probabilistic approach, and each traversal results in the generation of a test suite for evaluating software quality. The graph structure of the ACO algorithm for test suite generation is shown in Figure 3.

Figure 3. Graph Structure of ACO Algorithm for Generating Test Suites
As shown in Figure 3, during each level of graph traversal an ant has to select between two vertices that correspond to the same initial input. Each vertex, however, corresponds to a different possible input value, such as 0 or 1. An ant selects one of the two vertices based on the trail's probability, which is given by equation (1). Equation (1) is expressed in terms of the weight of the candidate trail and the weight of the alternate trail. The intensity of the trail on each edge is tracked over time, and an ant picks the next initial input at each time step. The pheromone trails are updated by using equation (2), which contains a coefficient that indicates the trail's probability between two points in time. When at a vertex, an ant uses the pheromone trail to compute the probability of choosing a given vertex as the next vertex by using equation (3). From equation (3), the probability of choosing the next vertex (i.e. test case) is determined. The above process is repeated until the maximum number of iterations is reached. The algorithmic process of the ACO algorithm for generating test suites is shown below.

Step 8: End while
Step 9: Return the best solution found
Step 10: End

Algorithm 1 ACO Based Test Suite Generation
As shown in Algorithm 1, an ant first chooses the test cases for generating test suites based on the trail's probability and subsequently updates the pheromone trails. During each level of the graph traversal, the ant decides whether to choose a vertex by generating a random number x. If x is less than ρ, the ant adds the vertex to the current trail (i.e. chooses the test case for generating test suites) and the vertex value is retained. Otherwise, the adjacent vertex is selected. This process is repeated until all the ants have traversed all graph levels using the same procedure. The vertices selected at each level of the graph traversal are collected together to generate a test suite.
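The traversal described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `aco_generate_suite`, its parameters, the reading of the two vertices per level as "skip" and "select" for one test case, the fitness (more requirements covered, then fewer test cases) and the update rule (decay every trail by `persistence`, then add 1 to the trails of the best suite so far) are all assumptions made for illustration.

```python
import random


def aco_generate_suite(coverage, n_ants=10, n_iterations=30, persistence=0.5, seed=0):
    """Hypothetical ACO-style suite generation.

    coverage maps a test case id to the set of requirements it covers.
    """
    rng = random.Random(seed)
    cases = sorted(coverage)
    # One graph level per test case, two vertices per level:
    # index 0 = "skip" trail, index 1 = "select" trail. Initial weights are 1.
    pheromone = [[1.0, 1.0] for _ in cases]
    best_suite, best_score = [], (-1, 0)

    for _ in range(n_iterations):
        for _ in range(n_ants):
            suite = []
            for level, case in enumerate(cases):
                skip_w, select_w = pheromone[level]
                # Trail probability: candidate weight over the sum of both weights.
                p_select = select_w / (select_w + skip_w)
                # Draw a random number x and keep the candidate vertex if x < p.
                if rng.random() < p_select:
                    suite.append(case)
            covered = set().union(*(coverage[c] for c in suite))
            # Assumed fitness: more requirements first, then fewer test cases.
            score = (len(covered), -len(suite))
            if score > best_score:
                best_suite, best_score = suite, score
        # Assumed update: decay every trail, then reinforce the best suite so far.
        chosen = set(best_suite)
        for level, case in enumerate(cases):
            pheromone[level][0] *= persistence
            pheromone[level][1] *= persistence
            pheromone[level][1 if case in chosen else 0] += 1.0
    return best_suite
```

Calling `aco_generate_suite({"t1": {"r1", "r2"}, "t2": {"r2"}, "t3": {"r3"}}, seed=1)` returns a list of test case ids; fixing `seed` makes the run repeatable, which is convenient when comparing parameter settings.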

B. Test Suite Reduction Using Greedy Discretization Algorithm
The GDACO technique uses the Greedy Discretization algorithm to determine the optimal solution to the test suite reduction problem. The Greedy Discretization algorithm repeatedly removes from the test suite set T the test cases that do not satisfy the user test requirements, until all the requirements are covered. The following diagram shows the process of the Greedy Discretization algorithm for obtaining a minimal cardinality subset of test suites.
In equation (4), the test requirements are represented as a TR table (matrix) with rows and columns, and the task is to choose a subset of rows that covers all of the columns in the matrix with minimal execution time. The GDACO technique uses the Greedy Discretization algorithm to remove the redundant test cases in different test suites. Consider 5 test suites with 9 test cases, as shown in Figure 5.

Figure 5. Example of Greedy Discretization Algorithm Process for Test Suite Reduction
As shown in Figure 5, the Greedy Discretization algorithm iteratively chooses the test cases that cover the maximum number of test requirements until all the requirements are fulfilled. Consequently, the Greedy Discretization algorithm removes the test cases that are redundant or that do not satisfy the test requirements. In Figure 5, the Greedy Discretization algorithm picks the test suites T1, T2 and T3 as a minimal cardinality subset of test suites that covers all the test requirements of the software program. As a result, the GDACO technique achieves a higher coverage rate. The algorithmic process of the Greedy Discretization algorithm for test suite minimization is shown below.

Algorithm 2 Greedy Discretization Based Test Suite Minimization
By using the above algorithmic process, the GDACO technique obtains a minimal cardinality subset of test suites that covers the maximum number of test requirements. This helps to achieve a higher coverage rate of faults in the software program.
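The greedy selection described in this subsection resembles the classical greedy heuristic for set cover, which can be sketched in Python as follows. The function name `greedy_reduce`, the dictionary layout and the sample data are hypothetical and are not taken from Figure 5. The sketch repeatedly picks the item that covers the most still-uncovered requirements, and ties are broken by sorted id.

```python
def greedy_reduce(coverage):
    """Hypothetical greedy reduction.

    coverage maps a test case (or suite) id to the set of requirements it covers.
    Returns the ids kept, in the order they were selected.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Pick the id that covers the most still-uncovered requirements.
        best = max(sorted(coverage), key=lambda c: len(coverage[c] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    # Every id not selected is redundant with respect to these requirements.
    return selected


# Illustrative data only (not the contents of Figure 5).
example = {
    "t1": {"r1", "r2", "r3"},
    "t2": {"r3", "r4"},
    "t3": {"r5"},
    "t4": {"r1", "r2"},
    "t5": {"r4"},
}
kept = greedy_reduce(example)  # ["t1", "t2", "t3"]
```

The greedy heuristic does not guarantee the smallest possible subset in general, but it is simple and each step is cheap, which is consistent with the emphasis on low execution time.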

EXPERIMENTAL SETTINGS
To evaluate its efficiency, the proposed Greedy Discrete Ant Colony Optimization (GDACO) technique is implemented in Java using the schoolmate data set. The GDACO technique employs the schoolmate data set to discover faults in software programs in order to increase software quality. The schoolmate data set consists of many PHP programs. The performance of the GDACO technique is measured in terms of coverage rate, average rate of test suite reduction and execution time.

RESULT AND DISCUSSIONS
In this section, the results of the GDACO technique are analyzed. The effectiveness of the GDACO technique is compared against two methods, namely the Memetic Algorithm [1] and the Test case minimization approach [2]. The efficiency of the GDACO technique is evaluated on the following metrics with the help of tables and graphs.

A. Measurement of Average Rate of Test Suite Reduction
The average rate of test suite reduction measures the ratio of the number of test suites reduced using the GDACO technique to the total number of test suites taken as input. It is measured as a percentage (%) and is mathematically formulated as

$$\text{Average rate of test suite reduction} = \frac{\text{number of test suites reduced}}{\text{total number of test suites}} \times 100 \quad (5)$$

From equation (5), the average rate of test suite reduction is measured. The higher the average rate of test suite reduction, the more efficient the method. Figure 6 describes the average rate of test suite reduction with respect to different numbers of test suites. As illustrated in the figure, the proposed GDACO technique provides a better average rate of test suite reduction than the existing Memetic Algorithm [1] and Test case minimization approach [2]. In addition, as the number of test suites increases, the average rate of test suite reduction also increases for all three methods. Comparatively, however, the average rate of test suite reduction using the proposed GDACO technique is higher. This is because of the application of Greedy Discretization based test suite minimization in the GDACO technique. With the aid of the Greedy Discretization algorithm, the proposed GDACO technique picks the test cases that cover more test requirements until all the requirements are fulfilled and consequently eliminates the test cases that are redundant or that do not satisfy the test requirements. This in turn improves the average rate of test suite reduction. The proposed GDACO technique therefore increases the average rate of test suite reduction by 21% when compared to the Memetic Algorithm [1] and by 8% when compared to the Test case minimization approach [2].
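As a small worked illustration of the reduction metric defined at the start of this subsection, the following Python sketch computes the percentage from suite counts. The function name `reduction_rate` is hypothetical, and the sketch assumes that "number of test suites reduced" means the number of suites removed, i.e. the original count minus the count kept.

```python
def reduction_rate(original_count, kept_count):
    """Percentage of test suites removed (assumed reading of the metric)."""
    if original_count <= 0:
        raise ValueError("original_count must be positive")
    removed = original_count - kept_count
    return 100.0 * removed / original_count


rate = reduction_rate(10, 4)  # 6 of 10 suites removed, i.e. 60.0
```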

B. Measurement of Coverage Rate
In the GDACO technique, the coverage rate measures the rate at which the maximum number of faults is covered by the reduced test suites, relative to the total number of test suites generated. The average coverage rate (CR) is measured as a percentage (%) and is given by equation (6). The higher the coverage rate, the more efficient the method. Table 3 shows the comparative analysis of the coverage rate obtained for different numbers of test cases using the three methods. The experiments are conducted in Java with the number of test suites varying in the range of 10-100. From the table values, it is evident that the coverage rate using the GDACO technique is higher than that of the existing Memetic Algorithm [1] and Test case minimization approach [2]. Figure 7 depicts the coverage rate with respect to different numbers of test suites. As shown in the figure, the proposed GDACO technique provides a better coverage rate for discovering more faults in a software program than the existing Memetic Algorithm [1] and Test case minimization approach [2]. In addition, as the number of test suites increases, the coverage rate also increases for all three methods. Comparatively, however, the coverage rate using the proposed GDACO technique is higher. This is owing to the application of Greedy Discretization based test suite minimization in the GDACO technique. By using the Greedy Discretization algorithm, the proposed GDACO technique chooses the test cases that cover the maximum number of test requirements until all the requirements are satisfied. This in turn improves the coverage rate of faults. As a result, the proposed GDACO technique increases the coverage rate by 26% when compared to the Memetic Algorithm [1] and by 12% when compared to the Test case minimization approach [2].
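One plausible way to compute a coverage rate of this kind is sketched below in Python. The function name `coverage_rate` and the choice of denominator (the set of all known faults) are assumptions made for illustration; the exact form of equation (6) may differ.

```python
def coverage_rate(detected_faults, known_faults):
    """Percentage of known faults detected by the reduced suites (assumed form)."""
    known = set(known_faults)
    if not known:
        raise ValueError("known_faults must not be empty")
    # Only faults that are in the known set count towards coverage.
    return 100.0 * len(set(detected_faults) & known) / len(known)


cr = coverage_rate({"f1", "f2", "f3"}, {"f1", "f2", "f3", "f4"})  # 3 of 4, i.e. 75.0
```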

C. Measurement of Execution Time
In the GDACO technique, the execution time measures the amount of time taken to generate the test suites. The execution time (ET) is measured in milliseconds (ms) and is given by equation (7). The lower the execution time, the more efficient the method. Table 4 shows the execution time for different numbers of test suites in the range of 10-100 using the three methods. From the table values, it is evident that the execution time of test suite generation using the GDACO technique is lower than that of the existing Memetic Algorithm [1] and Test case minimization approach [2]. Figure 8 shows the execution time with respect to different numbers of test suites using the three methods. As shown in the figure, the proposed GDACO technique provides a better execution time for generating test suites than the existing methods. In addition, as the number of test suites increases, the execution time also increases for all three methods. Comparatively, however, the execution time using the proposed GDACO technique is lower. This is due to the application of ACO based test suite generation in the GDACO technique, in which an ant chooses the test cases for generating test suites based on the trail's probability. The vertices selected at each level of the graph traversal are collected together in order to generate a test suite in less time. This in turn reduces the execution time. Thus, the proposed GDACO technique reduces the execution time of test suite generation by 36% when compared to the Memetic Algorithm [1] and by 30% when compared to the Test case minimization approach [2].
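Execution time in milliseconds can be measured with a small wrapper such as the Python sketch below. The helper name `timed_ms` is hypothetical, and the sketch measures wall-clock time around a single call, which is only one possible reading of the execution time metric.

```python
import time


def timed_ms(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Example: time any suite generation routine; sorted() stands in for it here.
result, elapsed = timed_ms(sorted, [3, 1, 2])
```

Averaging the elapsed time over several runs for each suite count gives more stable figures than a single measurement.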

RELATED WORKS
Multiple coverage criteria were applied in [12] for efficient test suite minimization and for improving fault detection capability. A novel method was designed in [13] that removes test case redundancy with the aid of a fuzzy clustering technique and provides good results for condition/path coverage. However, the time taken to remove the redundancy was high.
A hierarchical clustering approach was presented in [14] for test suite minimization, in which branch coverage is selected as the code coverage criterion in order to reduce the test suite. However, the reduced test suite does not cover more faults. A genetic algorithm was used in [15] to reduce the number of test cases in regression testing. This genetic algorithm reduces the cost of regression testing and enhances the efficiency of the software with the optimized test suite.
A data mining-based algorithm was presented in [16] in which the concept of maximal frequent item set mining is used for test suite reduction. A novel technique was designed in [17] to reduce the size of the test suite by using improved precision slices. However, the average rate of test suite reduction was lower. An effective strategy was proposed in [18] to generate a balanced test suite for spectrum-based fault localization. However, the coverage rate was lower.
A model-based approach was designed in [19] to lessen the fault detection rate in test suite generation. Test case classification was performed in [20] using tuned fuzzy logic for test suite reduction. However, it takes more execution time to reduce the test suites.

CONCLUSION
An efficient Greedy Discrete Ant Colony Optimization (GDACO) technique is developed with the objective of improving the coverage capability of test suite generation. The GDACO technique first generates the test suites by using the ACO algorithm. The ACO algorithm chooses test cases from the test case set based on the trail's probability and subsequently updates the pheromone trails in order to generate test suites. Afterwards, the GDACO technique performs test suite optimization with the aid of the Greedy Discretization algorithm. The Greedy Discretization algorithm selects the test cases that cover the test requirements and subsequently eliminates redundant test cases from the test suites. As a result, the GDACO technique obtains a minimal cardinality subset of test suites with a higher coverage rate for identifying the faults in software programs. This in turn reduces the testing cost of a software program. The efficiency of the GDACO technique is tested with metrics such as coverage rate, average rate of test suite reduction and execution time. From the experiments conducted, it is observed that the GDACO technique provides a higher coverage rate for improving software quality when compared to state-of-the-art works. The experimental results show that the GDACO technique provides better performance, with an improved average rate of test suite reduction and a higher coverage rate, when compared to state-of-the-art works.