A SURVEY PAPER ON INFORMATION RETRIEVAL SYSTEM

Arpit Deo, Jayesh Gangrade, Shweta Gangrade

Abstract


Information retrieval is the process of obtaining and presenting the most relevant information from a large collection of information resources according to the user's need. The tremendous growth of information resources on the Internet makes retrieval a tedious and difficult task for users. Because of this information overload, better techniques are needed to retrieve the most relevant information from the web. This paper presents an information retrieval system based on the particle swarm optimization (PSO) algorithm. In the presented system, all HTML tags are removed to extract the text from web documents. Stop words and special characters are then removed from the extracted text so that only meaningful content remains. TF-IDF is used for feature selection. PSO is then used to identify and refine the feature set, and the selected features are stored in a database that serves the retrieval process. On the query side, the input query is converted into several semantically similar query strings. These query strings are compared with the feature sets in the database using the cosine similarity function. The most similar text is returned as the result of the information retrieval system.
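The pipeline described in the abstract can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' implementation: the stop-word list, the smoothed IDF formula, and the names `tokenize`, `tfidf_vectors`, `cosine`, and `retrieve` are assumptions made for illustration. The PSO feature-refinement step is omitted entirely, and the semantically similar query strings are supplied by hand rather than generated.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list (an assumption; the paper does not give one).
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "on", "with"}


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and drops all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def tokenize(html):
    """Strip tags, lowercase, drop special characters and stop words."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = " ".join(parser.parts).lower()
    words = re.findall(r"[a-z0-9]+", text)
    return [w for w in words if w not in STOP_WORDS]


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    # Smoothed IDF; one common variant, chosen here as an assumption.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_variants, docs_html):
    """Return (index, score) of the document most similar to any query variant.

    Returns (-1, 0.0) when no variant shares a term with any document.
    """
    vectors, idf = tfidf_vectors([tokenize(d) for d in docs_html])
    best_index, best_score = -1, 0.0
    for query in query_variants:
        # Query terms unseen in the collection carry no weight and are skipped.
        counts = Counter(t for t in tokenize(query) if t in idf)
        total = sum(counts.values())
        if total == 0:
            continue
        q_vec = {t: (c / total) * idf[t] for t, c in counts.items()}
        for i, d_vec in enumerate(vectors):
            score = cosine(q_vec, d_vec)
            if score > best_score:
                best_index, best_score = i, score
    return best_index, best_score
```

Calling `retrieve` with a list of hand-written query variants and a list of HTML strings returns the index of the best-matching document and its cosine score. In the system the abstract describes, a PSO step would sit between `tfidf_vectors` and the comparison to prune the feature set before it is stored.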

Keywords


Information retrieval system; feature extraction; PSO optimization; similar query generator; similarity measure

DOI: https://doi.org/10.26483/ijarcs.v9i1.5505


Copyright (c) 2018 International Journal of Advanced Research in Computer Science