Performance Metrics for Selection of Quality Hidden Web Documents

Rashmi Agarwal, Niraj Singhal, Prem Sagar

Abstract


The hidden web continues to grow as organizations with large amounts of high-quality information place their content online, providing web-accessible search facilities over existing databases. In particular, extraneous web pages containing query search forms, as well as redundant data, are also retrieved while extracting content from the hidden web. This paper addresses the issues related to selecting hidden web documents. It introduces a generic operational model for the selection of quality hidden web documents. It also describes how this model helps extract quality hidden web documents by ignoring web pages that do not include a form as well as downloaded non-query forms, and by removing redundant query forms within the same domain.
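
The selection steps named in the abstract (skip pages without a form, drop non-query forms, remove redundant query forms within one domain) can be illustrated with a minimal sketch. This is not the paper's operational model: the names `FormCollector`, `is_query_form`, and `select_query_forms` are hypothetical, and the rule used to tell query forms from non-query forms (a free-text field and no password field) is an assumption made only for illustration, as is the notion of redundancy (same domain, same action path, same field names).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class FormCollector(HTMLParser):
    """Collects each <form> on a page as a dict of its action and input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action") or "", "fields": []}
        elif tag == "input" and self._current is not None:
            # An <input> without a type attribute defaults to a text field.
            field_type = (attrs.get("type") or "text").lower()
            self._current["fields"].append((field_type, attrs.get("name") or ""))

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def is_query_form(form):
    # Assumed heuristic: a query form has a free-text field and no password
    # field; login and registration forms are treated as non-query forms.
    types = {field_type for field_type, _ in form["fields"]}
    return "password" not in types and bool(types & {"text", "search"})


def select_query_forms(pages):
    """pages: iterable of (url, html). Returns unique query forms per domain."""
    seen = set()
    selected = []
    for url, html in pages:
        collector = FormCollector()
        collector.feed(html)
        # Pages that contain no form contribute nothing to the result.
        for form in collector.forms:
            if not is_query_form(form):
                continue
            action = urljoin(url, form["action"])
            names = tuple(sorted(name for _, name in form["fields"]))
            # Assumed redundancy key: domain, action path, and field names.
            key = (urlparse(url).netloc, urlparse(action).path, names)
            if key in seen:
                continue
            seen.add(key)
            selected.append({"page": url, "action": action, "fields": names})
    return selected
```

Given a page with no form, a login page, and two pages on the same domain that expose the same search form, this sketch would keep a single entry for the search form. A real system would need a more robust form classifier than the password-field heuristic assumed here.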


Keywords: Hidden web, quality documents, performance metrics, search forms, non-query form, submission efficiency, cost analysis.


DOI: https://doi.org/10.26483/ijarcs.v5i7.2318

Copyright (c) 2016 International Journal of Advanced Research in Computer Science