A Novel Approach for Text Classification using Self Clustering Algorithm
Abstract
The high dimensionality of text documents hinders the task of categorization. Feature clustering has proven effective for text categorization
problems because it reduces the dimensionality of the feature vectors. In this paper, we propose a Self Clustering (SC) algorithm for feature
clustering in which the number of extracted features is determined automatically. In this method, words are represented as distributions
and processed one by one, sequentially. Words that are similar to each other are grouped into the same cluster. A new cluster is created for a word
that is not similar to any existing cluster. Each cluster is characterized by a membership function with a statistical mean and deviation. Once all
the words have been fed in, the clusters are complete, and one extracted feature is obtained from each cluster. Moreover, the user need not specify
the number of extracted features in advance, so trial and error in determining the appropriate number of extracted features can be avoided.
Evaluation results show that the proposed method obtains reliable performance on text classification tasks.
Keywords: Feature Clustering, Text Classification, Self Clustering
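The sequential procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: a Gaussian-shaped membership function, a similarity threshold `rho`, an initial deviation `sigma0`, and an incremental (Welford-style) update of the mean and deviation. Each word pattern is assumed to be a list of numbers, such as the word's distribution over classes. The names `membership` and `self_cluster` are hypothetical.

```python
import math


def membership(x, mean, dev):
    """Product of per-dimension Gaussian-shaped memberships, in (0, 1].

    Assumption: the abstract only says each cluster has a membership
    function with a mean and deviation; this form is illustrative.
    """
    return math.prod(
        math.exp(-(((xi - m) / d) ** 2)) for xi, m, d in zip(x, mean, dev)
    )


def self_cluster(patterns, rho=0.5, sigma0=0.25):
    """Cluster word patterns one at a time; the cluster count is not preset.

    rho    -- assumed similarity threshold for joining an existing cluster
    sigma0 -- assumed initial deviation, also added as a floor so that a
              cluster's deviation never collapses to zero
    """
    clusters = []
    for idx, x in enumerate(patterns):
        scores = [membership(x, c["mean"], c["dev"]) for c in clusters]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)

        if best is None or scores[best] < rho:
            # The word is not similar to any existing cluster: create one.
            clusters.append({
                "n": 1,
                "mean": list(x),
                "m2": [0.0] * len(x),   # running sum of squared deviations
                "dev": [sigma0] * len(x),
                "members": [idx],
            })
            continue

        # The word joins the most similar cluster; update its statistics.
        c = clusters[best]
        c["n"] += 1
        c["members"].append(idx)
        for i, xi in enumerate(x):
            delta = xi - c["mean"][i]
            c["mean"][i] += delta / c["n"]
            c["m2"][i] += delta * (xi - c["mean"][i])
            c["dev"][i] = math.sqrt(c["m2"][i] / c["n"]) + sigma0
    return clusters
```

Calling `self_cluster` on a list of word patterns returns one dictionary per cluster, and `len` of the result is the automatically determined number of clusters. In this sketch a larger `rho` tends to produce more, tighter clusters, while a smaller `rho` tends to produce fewer, broader ones.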
DOI: https://doi.org/10.26483/ijarcs.v4i2.1524
Copyright (c) 2016 International Journal of Advanced Research in Computer Science

