Next Level Approach of Data Deduplication in the Era of Big Data

Shamsher Singh, Ravinder Singh


The word "database" itself carries a strong meaning: it indicates that a large amount of raw facts and figures is stored in one place in electronic form. Data gives the organization that owns it a chance to compete in a challenging world and to stand out from the crowd. Storing a large amount of data, however, requires a large amount of storage space. Almost 55% of the data stored in computer memory is present in duplicated form, which is very costly for an organization to manage. In this paper we propose a new strategy to locate similar data stored on disks in a big data environment. As a result, duplicated data is removed from the storage media, which frees up space, improves system performance in terms of operational speed, and reduces the time taken by the deduplication process.
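The abstract does not describe the proposed strategy in detail, so the following is only a minimal sketch of the general idea of data deduplication, not the authors' method. It assumes fixed-size blocks and SHA-256 content hashes; the function names `split_fixed`, `deduplicate`, and `restore` are hypothetical and chosen for illustration.

```python
import hashlib


def split_fixed(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def deduplicate(blocks: list[bytes]) -> tuple[dict[str, bytes], list[str]]:
    """Keep one copy of each distinct block.

    Returns a store mapping digest -> block, and a recipe listing the
    digests in input order so the original data can be rebuilt.
    """
    store: dict[str, bytes] = {}
    recipe: list[str] = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store only the first occurrence
        recipe.append(digest)
    return store, recipe


def restore(store: dict[str, bytes], recipe: list[str]) -> bytes:
    """Rebuild the original data from the store and the recipe."""
    return b"".join(store[d] for d in recipe)


if __name__ == "__main__":
    data = b"ABCD" * 3 + b"WXYZ"
    store, recipe = deduplicate(split_fixed(data, 4))
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert restore(store, recipe) == data
```

In this sketch, repeated blocks cost only one digest entry in the recipe rather than a second copy of the data, which is where the space saving comes from.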


Backup, Big Data, Data Deduplication, Data Node, Name Node


Copyright (c) 2017 International Journal of Advanced Research in Computer Science