IMPACT OF DISTANCE METRICS ON THE PERFORMANCE OF K-MEANS AND FUZZY C-MEANS CLUSTERING – AN APPROACH TO ASSESS STUDENT’S PERFORMANCE IN E-LEARNING ENVIRONMENT

Vilas Pandurangji Mahatme, Kishore K Bhoyar

Abstract


Clustering plays a vital role in many areas of research. In a clustering algorithm, the distance metric is a key constituent in finding regularities among data objects. Distance metrics are used as similarity measures, and most similarity measures used in clustering are distance based. However, distance metrics are not always sufficient: they do not work well when the goal is to capture correlations among data objects. Choosing the right distance metric for a given dataset is therefore a considerable challenge. This paper presents the impact of three metrics, Euclidean distance, Manhattan distance and the Pearson correlation coefficient, on the performance of k-means and fuzzy c-means clustering. Because similarity detection through the distance metric affects the accuracy of the algorithm, this study helps researchers decide quickly which metric to choose for clustering.
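The contrast the abstract draws between distance-based and correlation-based similarity can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of 1 - r as a Pearson-based dissimilarity, and the two score vectors are assumptions made only for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pearson_distance(a, b):
    """Dissimilarity defined as 1 - r, where r is the Pearson correlation.

    Assumption: 1 - r is one common convention; the paper may use another.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("Pearson correlation is undefined for a constant vector")
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Hypothetical scores of two students on four assessments (not from the paper).
student_a = [40.0, 50.0, 60.0, 70.0]
student_b = [60.0, 70.0, 80.0, 90.0]  # same trend as student_a, shifted up by 20

print("Euclidean:", euclidean(student_a, student_b))
print("Manhattan:", manhattan(student_a, student_b))
print("Pearson distance:", pearson_distance(student_a, student_b))
```

In this example the two students follow the same trend at different score levels, so the Euclidean and Manhattan distances are large while the Pearson-based dissimilarity is zero. Which behaviour is preferable depends on whether the clustering should group students by absolute score or by performance pattern.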

Keywords


data mining; k-means clustering; fuzzy c-means clustering; distance metrics; e-learning





DOI: https://doi.org/10.26483/ijarcs.v9i1.5417





Copyright (c) 2018 International Journal of Advanced Research in Computer Science