TEXT DOCUMENT CLUSTERING USING ARTIFICIAL BEE COLONY WITH BISECTING K-MEANS ALGORITHM

Janani Balakumar, Dr. S. Vijayarani

Abstract


Document clustering with optimization techniques has recently attracted the attention of many researchers, especially those who deal with large volumes of documents. The main goal of document clustering is to place documents with similar content in the same group and documents with dissimilar content in different groups. Pairing document clustering with an optimization algorithm aims to reach a globally optimal solution. This research work aims to cluster documents based on their content and, for this task, proposes a new hybrid algorithm called Artificial Bee Colony with Bisecting K-Means (ABC-BK). The proposed algorithm was evaluated on a benchmark dataset against widely used document clustering algorithms. Experimental results show that the proposed algorithm performs better than the standard ABC clustering algorithm, K-means, and the Bisecting K-means algorithm.

Keywords


Document Clustering, ABC Algorithm, K-means, Bisecting K-means, ABC-BK

DOI: https://doi.org/10.26483/ijarcs.v9i1.5359

Copyright (c) 2018 International Journal of Advanced Research in Computer Science