SOFTWARE FAULT PREDICTION USING CASE-BASED REASONING: A COMPARATIVE ANALYSIS

Abstract: Software fault estimation is important for increasing software reliability, and increased reliability in turn tends to increase software quality. To assess the quality of a software module, I have used four established similarity functions, namely the Euclidean, Canberra, Exponential, and Manhattan methods. The selection of a particular similarity measure may affect the precision of CBR-based fault prediction. It has been observed that all the distance functions perform nearly equally on the same data set, which indicates the efficiency of the indigenous tool.


INTRODUCTION
Software fault prediction has become crucial for increasing software reliability, and higher reliability provides better software quality. Faults are defects that result in software failure and unnecessarily increase testing costs. Faults or defects in software modules are becoming one of the biggest challenges and need to be resolved. Software quality assurance is a major concern in the modern era. Many software companies accept that software with faults or defects lacks quality. Therefore, a company needs a methodology that can remove faults at an early stage of the software development process, which reduces both the testing cost and the development cost. Various machine learning techniques have been applied to software fault prediction, such as support vector machines, neural networks, genetic algorithms, and many more. I have used case-based reasoning as a method for finding the errors or faults in a software module, and this is the novelty of this paper.

BACKGROUND AND RELATED WORKS
Besides the machine learning methods discussed so far, many techniques have been proposed for software fault prediction. I observed in particular that most of the models reported in the survey used previous fault data [16]. Many researchers have used AI-based approaches such as Case-Based Reasoning (CBR), Genetic Algorithm (GA), Neural Network (NN), and many more. Khan et al. [6] noted that the main objective of software quality prediction is to predict the reliability and stability of the software. Becker et al. [7] predicted the performance of the software. Zhong et al. [4] used unsupervised learning techniques to build a software quality estimation system. Case-based reasoning has also been used by Kadoda et al. [1]. Myrtveit et al. [2] and Ganesan et al. [3] have also studied CBR: it was applied to software quality modeling of a family of complete industrial software systems, and its accuracy was better than that of a corresponding multiple linear regression model in predicting the number of design faults. Aamodt and Plaza described the case-based reasoning cycle [9]. Rashid et al. emphasized the importance of software quality prediction and the accuracy of case-based estimation models [5], [8], [14], [15].

SOME IMPORTANT CATEGORIES OF MACHINE LEARNING TECHNIQUES
In this section, I discuss different learning methods, namely Artificial Neural Network (ANN), Genetic Algorithm (GA), and Case-Based Reasoning (CBR). Applying machine learning techniques is useful for improving the efficiency of CBR systems.
A. Artificial Neural Network (ANN):
An Artificial Neural Network can be applied to enhance the efficiency of a case-based reasoning system. ANNs have also been used to produce test cases for software testing [11]. However, recommendations for using neural networks are very much application-dependent.
B. Genetic Algorithm (GA):
The genetic algorithm is a search-based algorithm modeled on biological patterns. It was developed by John Holland and his colleagues in the course of their research [12]. It views learning as a competition among a population of evolving candidate problem solutions. A fitness function evaluates each solution to decide whether it will contribute to the subsequent generation of solutions. Then, through operations analogous to gene transfer in sexual reproduction, the algorithm creates a new population of candidate solutions [13].
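The evolutionary loop described above can be illustrated with a small sketch. It is not taken from [12] or [13]. The function name evolve, the bit-string encoding, the population size, the mutation rate, and the "one-max" fitness function are all assumptions chosen only to keep the example short.

```python
import random

def evolve(fitness, n_bits=12, pop_size=20, generations=30, seed=0):
    """Toy genetic algorithm over fixed-length bit strings (illustrative only)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # The fitness function ranks candidates; the fitter half become parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism (an assumption of this sketch): keep the best candidate as is.
        children = [population[0][:]]
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Single-point crossover, the analogue of gene transfer.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Mutation: flip one randomly chosen bit with probability 0.1.
            if rng.random() < 0.1:
                i = rng.randrange(n_bits)
                child[i] = 1 - child[i]
            children.append(child)
        population = children
    return max(population, key=fitness)

# "One-max" toy problem: the fitness of a candidate is its number of 1 bits.
best = evolve(fitness=sum)
print(best, sum(best))
```

Because the best candidate is always copied into the next generation, the best fitness in this sketch never decreases from one generation to the next.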

C. Case-Based Reasoning (CBR):
Case-based reasoning (CBR) was first formalized in the 1980s, growing out of the work of Schank and others on memory [10]. Case-based reasoning is one of the most popular machine learning techniques. CBR is a problem-solving paradigm that is fundamentally different from other major AI approaches in that, instead of relying solely on general knowledge of a problem domain, it uses specific cases [1]. Case-based reasoning works on past cases or experiences. The objective is to find the group of known cases that best matches the new case and to retain the new experience by adding it to the existing knowledge base (case base) for future solutions, as indicated in Figure 1.
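The retrieve-and-retain behaviour described above can be sketched as follows. This is a minimal illustration and not the tool used in this paper. The names Case, retrieve, reuse, and retain, the two-feature cases, the unweighted Manhattan distance, and the averaging of neighbour outcomes are all assumptions, and the revise step of the cycle in [9] is omitted.

```python
from dataclasses import dataclass

@dataclass
class Case:
    features: list[float]  # metric values describing a module (hypothetical)
    faults: float          # known outcome recorded for that module

def distance(a, b):
    # Unweighted Manhattan distance, used here only as a placeholder measure.
    return sum(abs(x - y) for x, y in zip(a, b))

def retrieve(case_base, query, k=3):
    # Retrieve: the k stored cases closest to the new case.
    return sorted(case_base, key=lambda c: distance(c.features, query))[:k]

def reuse(neighbours):
    # Reuse: predict the outcome as the mean outcome of the retrieved cases.
    return sum(c.faults for c in neighbours) / len(neighbours)

def retain(case_base, query, confirmed_faults):
    # Retain: store the new experience in the case base for future queries.
    case_base.append(Case(list(query), confirmed_faults))

case_base = [Case([10, 2], 1), Case([12, 3], 2),
             Case([40, 9], 7), Case([45, 10], 8)]
query = [11, 2]
neighbours = retrieve(case_base, query, k=2)
predicted = reuse(neighbours)
retain(case_base, query, confirmed_faults=1)
print(predicted, len(case_base))
```

In this toy case base, the two nearest cases have 1 and 2 recorded faults, so the prediction for the new case is their mean.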

APPLICATION OF DISTANCE FUNCTIONS
In this section, I apply four distance functions to find the error level or faults in a software module using the case-based reasoning methodology. All four distance functions are weighted, as indicated in Table 1.
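As an illustration, three of the four weighted distance functions can be sketched in their common textbook forms. These forms are assumptions and may differ from the exact definitions in Table 1. The function names, the sample vectors, and the unit weights are hypothetical. The exponential function is left out because its formula is not given in the text.

```python
import math

def weighted_euclidean(x, y, w):
    # sqrt( sum_i w_i * (x_i - y_i)^2 )
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))

def weighted_manhattan(x, y, w):
    # sum_i w_i * |x_i - y_i|
    return sum(wi * abs(xi - yi) for xi, yi, wi in zip(x, y, w))

def weighted_canberra(x, y, w):
    # sum_i w_i * |x_i - y_i| / (|x_i| + |y_i|); 0/0 terms are skipped.
    total = 0.0
    for xi, yi, wi in zip(x, y, w):
        denom = abs(xi) + abs(yi)
        if denom:
            total += wi * abs(xi - yi) / denom
    return total

stored = [3.0, 4.0, 0.0]   # metric vector of a stored case (made-up values)
new = [0.0, 0.0, 0.0]      # metric vector of the new case (made-up values)
weights = [1.0, 1.0, 1.0]  # one weight per metric

print(weighted_euclidean(stored, new, weights))  # 5.0
print(weighted_manhattan(stored, new, weights))  # 7.0
print(weighted_canberra(stored, new, weights))   # 2.0
```

The Canberra form normalizes each term by the magnitude of the two values, so it is less sensitive to metrics with large ranges than the Euclidean and Manhattan forms.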

METHODOLOGY
The primary data used in this paper were collected from B.Tech students of computer science and engineering on the college campus. All the students were given programs in the form of assignments, which were written in a high-level language. Based on the students' data, I calculated the error with six parameters using case-based reasoning. The six metrics used in this paper are given below.

CONCLUSIONS
Applying machine learning methods, especially case-based reasoning, is often useful for improving results because it applies to all knowledge containers. In this research paper, in contrast to other studies that emphasized NN, I have used four similarity measure functions to compare the results obtained with the case-based reasoning method. Except for the exponential method, all the distance functions perform nearly the same. This shows the importance of distance functions when calculating the faults in a software module. Predicting faults or errors in software using case-based reasoning with an indigenous tool is the novelty of this paper and addresses a need of research scholars.