Document representation techniques and their effect on the document Clustering and Classification: A Review

Ksh Nareshkumar Singh
H. Mamata Devi
Anjana Kakoti Mahanta

Abstract

Text data is the most common form of storing information. When a search engine processes a query, the user obtains a large collection of text data. Not all of the retrieved text data is relevant to the required information, so this massive amount of text data needs to be organised. Analysing and processing text data is the main concern of text mining. Text mining uses the standard data mining methods of classification and clustering. These two methods are used to arrange documents, which are usually represented by hundreds or thousands of text (word) features. The text in a document can be represented in various ways. In this paper, we present a study of various research papers that explore the area of text mining, including different document representation methods and their impact on clustering and classification results.
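
As an illustration of what a document representation can look like, the following is a minimal sketch of one common choice, a bag-of-words model with TF-IDF weighting, compared using cosine similarity. The abstract does not name this specific method; the weighting scheme, the naive whitespace tokenizer, the function names, and the three sample documents are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace (an assumption).
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return (vocabulary, vectors) where each vector holds TF-IDF weights."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts for this document
        # Weight = term count * log(N / document frequency).
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical sample documents, not taken from the paper.
docs = [
    "clustering groups similar documents",
    "classification assigns documents to labels",
    "clustering groups similar text",
]
vocab, vectors = tf_idf_vectors(docs)
sim_0_2 = cosine(vectors[0], vectors[2])
sim_0_1 = cosine(vectors[0], vectors[1])
```

In this sketch each document becomes a vector with one dimension per vocabulary term, so the first and third documents, which share three terms, come out more similar than the first and second, which share one. A clustering or classification algorithm would then operate on these vectors, which is why the choice of representation can affect its results.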
