A NOVEL TWO-PHASE PAGE FEATURE AND KTH KEYPHRASE FINGERPRINT BASED DUPLICATE DETECTION TECHNIQUE
COPYRIGHT
Submission of a manuscript implies that the work described has not been published before; that it is not under consideration for publication elsewhere; and that, if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
- The journal allows the author(s) to retain publishing rights without restrictions.
- The journal allows the author(s) to hold the copyright without restrictions.