DBSCAN BASED SEED INITIALIZATION OF K-MEANS ALGORITHM

Sameer Koul

Abstract

This paper proposes an effective approach to the problem of choosing the initial number of clusters for unsupervised data mining algorithms. We present a critical review of various approaches that find the optimal number of clusters for clustering algorithms. In this paper we use the DBSCAN algorithm to obtain initial seeds for the basic k-means algorithm. To evaluate the proposed approach we use the iris dataset, the liver disorder dataset and the seed dataset.
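The idea of seeding k-means from DBSCAN output can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that the number of DBSCAN clusters is taken as k and that the mean of each DBSCAN cluster serves as an initial center. The function names, the `eps` and `min_pts` values, and the toy points are hypothetical and chosen only for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1              # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue                # already assigned, do not expand again
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # only core points extend the cluster
                queue.extend(j_nbrs)
    return labels

def mean(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def dbscan_seeds(points, eps, min_pts):
    """Assumed seeding rule: one seed per DBSCAN cluster, placed at the cluster mean."""
    labels = dbscan(points, eps, min_pts)
    k = max(labels) + 1
    if k == 0:
        raise ValueError("DBSCAN found no clusters; adjust eps or min_pts")
    return [mean([p for p, l in zip(points, labels) if l == c]) for c in range(k)]

def kmeans(points, centers, max_iter=100):
    """Basic k-means (Lloyd iterations) started from the given centers."""
    assign = []
    for _ in range(max_iter):
        assign = [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:      # centers unchanged, so the assignment is stable
            break
        centers = new_centers
    return centers, assign

# Hypothetical toy data: two dense groups plus one isolated point.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1), (8.1, 8.2),
          (4.5, 15.0)]
seeds = dbscan_seeds(points, eps=0.5, min_pts=3)
centers, assign = kmeans(points, seeds)
```

In this sketch DBSCAN treats the isolated point as noise, so it does not influence the seeds, while k-means still assigns it to the nearest center afterwards. Both `eps` and `min_pts` have to be tuned per dataset, and the neighbor search here is quadratic in the number of points, which is acceptable only for small data.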

Author Biography

Sameer Koul, Islamia College of Science and Commerce

Assistant Professor
