Techniques For Efficient Short Text Understanding: A Survey on Related Literature

Geena Jojy
Reshmi R

Abstract

The use of social media and online applications has grown rapidly over the past few years. This computer-mediated communication has generated a large volume of short texts. A short text is a text with limited contextual information. There is considerable interest in analyzing and conceptualizing short texts, for example to understand user intent from search queries or to mine social media messages. Consequently, the task of understanding short text is crucial to many online applications. However, it is not easy to handle an enormous volume of short texts, since they are more ambiguous and noisy than normal text. Short texts do not necessarily follow the syntax of natural language, which points to the need for an efficient text understanding technique. The task of short text understanding, or conceptualization, can be divided into three steps: text segmentation, type detection, and concept labeling. In text segmentation, the input text is first processed to remove any stop words and is then divided into a sequence of terms. POS tagging decides the lexical types (i.e., POS tags) of the terms in a text. Type detection is incorporated into the framework for short text understanding and helps to perform disambiguation based on the various types of contextual information present in the text. Finally, concept labeling is performed to discover the hidden semantics of a natural language text. Various online applications, such as automatic question answering, recommendation systems, online advertising, and search engines, can benefit from conceptualization. All these applications require an information extraction phase whose first step is to extract the concepts from the input text. Nowadays, conceptualization is used to develop machine learning techniques for information extraction. Hence the task of conceptualization, or short text understanding, plays a vital role in machine learning, which is an active area of research. In this paper, the current techniques used for text segmentation, type detection, and concept labeling are reviewed.

Keywords: Short text understanding; conceptualization; semantic labeling; text segmentation; part-of-speech tagging
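
As a rough illustration of the three steps named in the abstract (text segmentation, type detection, concept labeling), a minimal Python sketch is given below. It is not a technique taken from the surveyed literature. The stop-word list, the vocabulary, the concept relatedness pairs, and all function names are hypothetical and chosen only for this example. Segmentation is approximated here by greedy longest-match against the vocabulary, and disambiguation by counting related concepts in the surrounding terms.

```python
# Toy sketch of short text understanding; every lexicon here is hypothetical.
STOP_WORDS = {"the", "a", "an", "of", "for", "in", "to"}

# Hypothetical vocabulary: term -> (type, candidate concepts).
VOCAB = {
    "apple": ("instance", ["fruit", "company"]),
    "ipad": ("instance", ["device"]),
    "pie": ("instance", ["food"]),
    "new york": ("instance", ["city"]),
    "watch": ("verb", []),
}

# Hypothetical pairs of related concepts, used only for disambiguation.
RELATED = {("company", "device"), ("fruit", "food")}


def segment(text, max_len=3):
    """Drop stop words, then split into terms by greedy longest match."""
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    terms, i = [], 0
    while i < len(words):
        # Try the longest candidate first; a single word is always accepted.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in VOCAB:
                terms.append(candidate)
                i += n
                break
    return terms


def detect_types(terms):
    """Attach a type to each term; unseen terms are typed 'unknown'."""
    return [(t, VOCAB.get(t, ("unknown", []))[0]) for t in terms]


def label_concepts(typed_terms):
    """Pick, for each instance, the concept best supported by its context."""
    labels = {}
    for term, term_type in typed_terms:
        if term_type != "instance":
            continue
        # Concepts of all other instance terms form the context.
        context = {
            c
            for other, other_type in typed_terms
            if other != term and other_type == "instance"
            for c in VOCAB[other][1]
        }

        def score(concept):
            return sum(
                (concept, ctx) in RELATED or (ctx, concept) in RELATED
                for ctx in context
            )

        # On ties, max() keeps the first candidate listed in VOCAB.
        labels[term] = max(VOCAB[term][1], key=score)
    return labels


def understand(text):
    """Run segmentation, type detection, and concept labeling in order."""
    return label_concepts(detect_types(segment(text)))


if __name__ == "__main__":
    print(understand("apple ipad"))
    print(understand("the apple pie"))
```

In this sketch the ambiguous term "apple" is labeled differently depending on whether a device or a food term appears next to it, which is the kind of context-based disambiguation the abstract attributes to type detection and concept labeling. A realistic system would replace the hand-written lexicons with a large knowledge base and a trained tagger.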
