Filtered Indexing: An Alternate Indexing Mechanism for De-Duplication


L. Mary Immaculate Sheela
S. Preethi

Abstract

Years ago, the internet was used mainly to retrieve information. The paradigm has since shifted, and the internet now provides many different types of services to its users. Today, cloud computing has become the buzzword. The cloud describes a new supplement, consumption, and delivery model for internet-based services, and it offers virtualized services over the internet. When data is spread widely across the cloud, duplication becomes inevitable, and storage systems at present hold a vast amount of duplicated or redundant data. It is virtually impossible to prevent duplication from arising; instead, redundancy can be removed through data de-duplication, a technique in which redundant data is deleted and only a unique copy of the data is kept, thereby improving storage utilization. Data de-duplication can eliminate multiple copies of the same file as well as duplicated segments, or chunks, of data within those files. A current issue for data de-duplication is avoiding full-chunk indexing, in which incoming data is checked against an index of all chunks to determine whether it is new; this is a time-consuming process and a major issue in the cloud. In this paper we propose an efficient indexing mechanism using filtered index databases. We first divide the data into variable-length chunks using a sliding window. Each chunk is then given a chunk ID using a hash function. With filtered index databases, the disk storage required is much less than that required by a table, and search time is much reduced.
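The pipeline outlined in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify how the filtered index databases are built, so a small Bloom filter is swapped in here as one plausible kind of filter placed in front of the full chunk index. All names (`split_chunks`, `BloomFilter`, `DedupStore`) and all parameters (window width, boundary mask, chunk size limits, the choice of MD5 and SHA-256) are assumptions made for illustration. The window hash is recomputed at every position for clarity; a real system would more likely use a rolling hash.

```python
import hashlib

WINDOW = 16      # sliding-window width in bytes (assumed)
MASK = 0x3F      # cut when the low 6 bits of the window hash are zero (assumed)
MIN_CHUNK = 32   # smallest chunk allowed (assumed)
MAX_CHUNK = 256  # forced cut if no boundary is found earlier (assumed)


def split_chunks(data: bytes):
    """Yield variable-length chunks whose boundaries depend on the content."""
    start = 0
    for i in range(len(data)):
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        window = data[i + 1 - WINDOW:i + 1]
        fingerprint = int.from_bytes(hashlib.md5(window).digest()[:4], "big")
        if (fingerprint & MASK) == 0 or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]


class BloomFilter:
    """Compact filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: bytes):
        digest = hashlib.sha256(key).digest()
        for j in range(self.num_hashes):
            yield int.from_bytes(digest[4 * j:4 * j + 4], "big") % self.size

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


class DedupStore:
    """Keeps one copy of each chunk; the filter screens full-index lookups."""

    def __init__(self):
        self.filter = BloomFilter()
        self.index = {}        # chunk ID -> chunk bytes (stands in for the full index)
        self.full_lookups = 0  # how often the full index had to be consulted

    def put(self, data: bytes):
        """Store data and return its recipe, the ordered list of chunk IDs."""
        recipe = []
        for chunk in split_chunks(data):
            chunk_id = hashlib.sha256(chunk).hexdigest()
            key = chunk_id.encode()
            if self.filter.might_contain(key):
                self.full_lookups += 1
                is_new = chunk_id not in self.index
            else:
                is_new = True  # the filter proves the chunk was never stored
            if is_new:
                self.index[chunk_id] = chunk
                self.filter.add(key)
            recipe.append(chunk_id)
        return recipe

    def get(self, recipe) -> bytes:
        """Rebuild the original data from a recipe."""
        return b"".join(self.index[chunk_id] for chunk_id in recipe)
```

In this sketch, storing the same data a second time adds no new chunks to the index, and a chunk that the filter reports as absent is written without a full-index lookup. Whether the paper's filtered index databases behave like this filter is an assumption.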


Keywords: Cloud computing, Data De-duplication, Full Chunk Indexing, Filtered index.

