HPAARM: Hybrid Parallel Algorithm for Association Rule Mining

Prathyusha Kanakam, S Radha Krishna

Abstract


Data mining is a broad area of research, and much current work focuses on its most important techniques for decision making. Discovering patterns or frequent episodes in transactions, in order to infer rules from them, is an important problem in data mining, and association rule mining is considered a powerful technique for this purpose. The problem of mining association rules consists of finding the large itemsets and then generating the association rules from these itemsets. Finding the large itemsets requires scanning the dataset many times. Many algorithms have been developed to improve the performance of association rule mining by reducing the number of scans over the dataset. In this paper, we aim to optimize the process further by developing techniques that reduce the number of database scans to a single scan. To deal with the huge size of the data, we have designed a parallel algorithm that reduces both the execution time and the number of scans over the database, in order to minimize I/O overhead as much as possible. We introduce approaches for implementing two basic algorithms for association rule discovery, namely Apriori and Eclat. Our approach uses efficient data structures (Radix Trees) to encode key information such as line indexes and candidates. We also present different types of efficient data structures and discuss the merits and demerits of using them to deduce association rules.
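As an illustration only, the single-scan idea behind Eclat can be sketched as follows. This is not the authors' HPAARM implementation: it is sequential, it uses plain Python sets in place of Radix Trees, and the names `build_tidlists`, `eclat`, and `mine` are hypothetical. The dataset is read once to build, for each item, the set of transaction indexes in which it appears; the support of every larger itemset is then obtained by intersecting these sets rather than by rescanning the data.

```python
from collections import defaultdict


def build_tidlists(transactions):
    """Single pass over the data: map each item to the set of
    transaction indexes (line indexes) that contain it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return tidlists


def eclat(prefix, candidates, min_support, frequent):
    """Depth-first search: extend `prefix` with each candidate item and
    compute supports by intersecting tid-sets (no further data scans)."""
    for i, (item, tids) in enumerate(candidates):
        itemset = prefix + (item,)
        frequent[itemset] = len(tids)
        extensions = []
        for other, other_tids in candidates[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                extensions.append((other, common))
        eclat(itemset, extensions, min_support, frequent)


def mine(transactions, min_support):
    """Return {itemset (tuple of items): support count} for all frequent
    itemsets. `min_support` is an absolute count (an assumption here)."""
    tidlists = build_tidlists(transactions)
    candidates = sorted(
        ((item, tids) for item, tids in tidlists.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    frequent = {}
    eclat((), candidates, min_support, frequent)
    return frequent


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, support in sorted(mine(data, min_support=2).items()):
        print(itemset, support)
```

In a parallel setting, one plausible design is to hand each top-level item and its extensions to a separate worker, since those subtrees of the search do not share state; whether HPAARM partitions the work this way is not stated in the abstract.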

 

Keywords: Data mining; Patterns; Association Rules; Parallel Algorithm; Item Sets; Apriori; Eclat; Radix Trees; Line Indexes




DOI: https://doi.org/10.26483/ijarcs.v4i9.1849





Copyright (c) 2016 International Journal of Advanced Research in Computer Science