Defect Prediction by Pruning Redundancy in Association Rule Mining


Amarpreet Kaur


Defect prediction is a major problem during software maintenance and evolution. To improve software quality, developers need to identify defective modules, and many organizations want to predict defects before a system is deployed so that they can both measure and improve its quality. Researchers have proposed various approaches for identifying the defect-prone modules of a given software system. This paper focuses on Apriori, an algorithm based on association rule mining, which remains a popular and effective method for extracting meaningful information from large data sets. Here, Apriori discovers association rules that are used to predict whether a software module is defective. Different algorithms perform differently on different datasets. This paper analyzes the shortcomings of the Apriori algorithm and studies strategies for improving its performance by removing redundancy among the generated rules on the basis of different parameters. We use a new method, based on heuristic analysis, to select the best ‘n’ association rules from a pool of ‘k’ association rules. This study will help improve existing software defect prediction models in terms of precision, performance, and other aspects.
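
The general idea of mining defect rules with Apriori, pruning redundant ones, and keeping the best ‘n’ can be illustrated with a minimal sketch. It is not the heuristic used in this paper, which the abstract does not specify. The toy transactions, the attribute names (`loc=high`, `cc=high`), the thresholds, and the redundancy criterion (a rule is dropped when a more general rule with the same consequent has equal or higher confidence) are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is one software module, described by
# discretized metrics plus the label "defective" where applicable.
TRANSACTIONS = [
    {"loc=high", "cc=high", "defective"},
    {"loc=high", "cc=high", "defective"},
    {"loc=high", "cc=low", "defective"},
    {"loc=low", "cc=high"},
    {"loc=low", "cc=low"},
    {"loc=low", "cc=low"},
]

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    n = len(transactions)
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    while level:
        # Count each candidate and keep those meeting the support threshold.
        current = {}
        for cand in level:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        joined = {a | b for a, b in combinations(current, 2)
                  if len(a | b) == len(a) + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        level = {c for c in joined
                 if all(frozenset(s) in current
                        for s in combinations(c, len(c) - 1))}
    return frequent

def defect_rules(frequent, target, min_confidence):
    """Build rules 'antecedent -> target' as (antecedent, support, confidence)."""
    rules = []
    for itemset, support in frequent.items():
        if target in itemset and len(itemset) > 1:
            antecedent = itemset - {target}
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    return rules

def prune_redundant(rules):
    """Drop a rule if a strictly more general rule is at least as confident."""
    return [(ant, sup, conf) for ant, sup, conf in rules
            if not any(other < ant and other_conf >= conf
                       for other, _, other_conf in rules)]

def best_n(rules, n):
    """Rank by confidence, then support, and keep the top n rules."""
    return sorted(rules, key=lambda r: (-r[2], -r[1], sorted(r[0])))[:n]

if __name__ == "__main__":
    mined = defect_rules(frequent_itemsets(TRANSACTIONS, 0.3), "defective", 0.6)
    for antecedent, support, confidence in best_n(prune_redundant(mined), 2):
        print(sorted(antecedent), "-> defective",
              f"support={support:.2f} confidence={confidence:.2f}")
```

In this sketch, pruning runs before ranking, so the ‘n’ slots are not spent on specializations of a rule that is already in the list. Other interestingness measures, such as lift, could replace confidence in the sort key.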




Author Biography

Amarpreet Kaur, Central University of Punjab

Research Scholar, Computer Science and Technology

