A RESEARCH REVIEW ON COMPARATIVE ANALYSIS OF DATA MINING TOOLS, TECHNIQUES AND PARAMETERS

Anil Sharma
Balrajpreet Kaur

Abstract

Data mining is the process of discovering previously unknown patterns in large databases. It is central to knowledge discovery, which gives substantial support to both business and academia. Various data mining tools have been developed to make this knowledge discovery possible. These tools provide an interface for loading data and retrieving interesting patterns from it, which can then be used to derive new knowledge. The literature defines a variety of parameters that serve as the basis for a tool's analysis, and different tools are available to perform such analyses. It is therefore worthwhile to compare these tools and observe their behavior on selected parameters, which in turn helps to identify the most appropriate tool for a given dataset and set of parameters. In this paper, the authors experiment with two different datasets in the WEKA tool, evaluated on six parameters. The results show that the parameter values differ with the type of dataset, namely balanced or unbalanced.
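The effect of class balance on evaluation parameters can be illustrated with a minimal sketch. The abstract does not name the six parameters, so the four measures used here (accuracy, precision, recall and F1) are assumptions chosen for illustration, and the confusion-matrix counts are hypothetical values, not results from the paper or output from WEKA.

```python
def metrics(tp, fp, fn, tn):
    """Compute four common measures from one binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 50 positives and 50 negatives (balanced).
balanced = metrics(tp=45, fp=5, fn=5, tn=45)

# Hypothetical counts: 10 positives and 90 negatives (unbalanced).
unbalanced = metrics(tp=2, fp=1, fn=8, tn=89)

for name, result in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

With these made-up counts the two accuracies are nearly the same, while recall and F1 on the minority class are much lower for the unbalanced case. This is one way a disparity between balanced and unbalanced datasets can show up.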

Author Biographies

Anil Sharma, Lovely Professional University

Department of Computer Science and Application

Balrajpreet Kaur, Lovely Professional University

Department of Computer Science and Application
