Ontology Based Information Retrieval Using Vector Space Model
Abstract
Information retrieval (IR) is the science of searching for documents, for information within documents, and for metadata about documents, as well as of searching relational databases and the World Wide Web. The growing volume of information on the Internet increases the need for new (semi-)automatic methods for retrieving documents and ranking them according to their relevance to the user query. In this paper, after a brief review of ranking models, a new ontology-based approach for ranking HTML/TXT documents is proposed and evaluated under various circumstances. Our approach applies the vector space model. This combination preserves the precision of ranking without sacrificing speed. Our approach exploits natural language processing techniques for extracting phrases and stemming words. The annotated documents and the expanded query are then processed with statistical methods to compute the degree of relevance. The outstanding features of our approach are (1) combining HTML, TXT, and PDF documents, (2) computing the frequency of every word, (3) removing stop words, (4) applying the Porter stemming algorithm to remove the suffix of every word, and (5) allowing variable input documents using vector dimensions. A ranking system called Information Retrieval using Vector Space Model (IRVSM) has been developed to implement and test the proposed model.
Keywords: Ontology, Parsing, Indexing, Stemming, Vector Space Model, Document Ranking.
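The preprocessing and ranking steps named in the abstract (stop-word removal, stemming, word frequencies, vector space ranking) can be illustrated with a minimal sketch. It is not the IRVSM implementation: the stop-word list, the `crude_stem` helper (a simple suffix stripper standing in for the Porter algorithm), the smoothed IDF weighting, and the function names are all assumptions made for illustration, and the ontology-based query expansion is omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to"}


def crude_stem(word):
    """Strip a few common suffixes.

    A stand-in for the Porter stemmer, not an implementation of it.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split on non-letters, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]


def rank(query, documents):
    """Return (doc_index, score) pairs sorted by cosine similarity, best first."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in doc_tokens for term in set(tokens))
    # Smoothed IDF (an assumed weighting) so common terms keep a non-zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(tokens):
        # Sparse TF-IDF vector; terms unseen in the collection are ignored.
        tf = Counter(tokens)
        return {t: count * idf[t] for t, count in tf.items() if t in idf}

    def cosine(u, v):
        dot = sum(w * v[t] for t, w in u.items() if t in v)
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    q_vec = vectorize(tokenize(query))
    scores = [(i, cosine(q_vec, vectorize(t))) for i, t in enumerate(doc_tokens)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = [
        "Ranking documents with the vector space model",
        "Cooking pasta for dinner",
        "Stemming words and ranking queries",
    ]
    for index, score in rank("ranked document", docs):
        print(f"{score:.3f}  {docs[index]}")
```

In this toy collection the query "ranked document" matches the first document on two stemmed terms and the third on one, so the first ranks highest and the unrelated second document scores zero. A production system would replace `crude_stem` with a real Porter stemmer and would first extract plain text from HTML or PDF input.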