FiDoop: An Interactive GUI to Identify Frequent Items Using Map Reduce

Raksha D; P Hari Prasad Reddy; Mukesh P U; Prof. Raghavendra Reddy

doi:10.26483/ijarcs.v9i0.6244

Published: Aug 8, 2018

Keywords:

FiDoop, MapReduce, Frequent itemset mining

Abstract

With the exponential growth of real-time data monitoring systems, extracting frequent itemsets from large databases has become a challenging task. Existing frequent itemset mining algorithms are limited by high memory usage, excessive runtime even on small amounts of data, and the lack of automatic parallelization. To overcome these problems, a FiDoop-based itemset mining algorithm built on the MapReduce framework is introduced. The system comprises the following activities: data uploading, preprocessing, threshold computation, support and confidence calculation, merging, and result presentation. We implement FiDoop on our in-house Hadoop cluster. To improve FiDoop's performance, a workload balance metric is developed to measure load balancing across the cluster's computing nodes. Initially, data is selected from the dataset and uploaded to the server; the preprocessing stage then removes columns that contain unwanted entries. The data is analyzed and partitioned to compute the threshold value. Finally, the frequent itemsets are merged to obtain the frequent patterns. The proposed system is developed mainly to improve accuracy and is evaluated using performance measures.
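
The support and confidence step named in the abstract can be illustrated with a minimal sketch. This is not the authors' FiDoop implementation and does not run on Hadoop: it simulates a map phase and a reduce phase in plain Python, and the function names, the sample transactions, the `max_size` limit, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, max_size=2):
    """Emit (itemset, 1) for every itemset of up to max_size items."""
    items = sorted(set(transaction))  # sort so each itemset has one canonical key
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each itemset."""
    counts = defaultdict(int)
    for itemset, count in pairs:
        counts[itemset] += count
    return dict(counts)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets meeting the support threshold."""
    pairs = (pair for t in transactions for pair in map_phase(t, max_size))
    counts = reduce_phase(pairs)
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


def confidence(supports, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A and C) / support(A)."""
    both = tuple(sorted(set(antecedent) | set(consequent)))
    return supports[both] / supports[tuple(sorted(antecedent))]


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
supports = frequent_itemsets(transactions, min_support=0.5)
rule_confidence = confidence(supports, ["bread"], ["milk"])
```

With this sample data, the itemset `("bread", "milk")` appears in two of four transactions (support 0.5), and the rule bread -> milk has confidence 0.5 / 0.75, about 0.67. Enumerating every candidate itemset per transaction is only practical for small `max_size`; it is used here because it keeps the map and reduce roles easy to see.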

Issue

2018: Volume 9 Special Issue No. 3, May 2018

Section

Articles

COPYRIGHT

Submission of a manuscript implies that the work described has not been published before; that it is not under consideration for publication elsewhere; and that, if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
The journal allows the author(s) to retain publishing rights without restrictions.
The journal allows the author(s) to hold the copyright without restrictions.
