CONTENT BASED MULTISPECTRAL SATELLITE IMAGE RETRIEVAL SYSTEM USING HYBRID MODEL

Abstract: The volume of Earth observation data is increasing day by day, and storing satellite data and retrieving relevant results from it is becoming a major task. Content Based Image Retrieval (CBIR) is a technique for retrieving relevant images from a database. The traditional method of image retrieval was Text Based Image Retrieval (TBIR), in which images were annotated with text. The results obtained with TBIR were not sufficiently accurate, which created the need for CBIR. In CBIR, images are retrieved on the basis of their color, shape and texture features. These features are compared with the features of the query image, and the results are obtained with a similarity measure. In this paper, the Euclidean distance is used as the similarity measure, and the relevance of the results is evaluated using precision and recall.


I. INTRODUCTION
Because of easy access to the internet, the amount of data has increased tremendously over the past years, and visual data in particular is growing rapidly. Storing and retrieving such a wide range of data has become a challenge, which has made image processing a more interesting and demanding field. An image retrieval system is concerned with storing and retrieving images efficiently and effectively. Work on image retrieval dates back to the 1970s [11]. Early image retrieval was based on keyword annotation: a keyword was assigned to each image, and images were retrieved on the basis of that keyword. However, manual annotation was time consuming and the results were not very accurate. It also became impractical because every user perceives an object differently, so retrieval based on such annotations may not be accurate. Typing errors may occur, or a user may perceive an image differently and annotate it differently. In 1992, the National Science Foundation of the United States organized a workshop on Visual Information Management Systems to identify new directions in image database management systems. It was widely recognized that visual content would help with these problems, and the concept of Content Based Image Retrieval emerged from that workshop.
Content Based Image Retrieval (CBIR) is a technique for retrieving images on the basis of visual content such as color, shape and texture [15]. "With the help of single content let's say color we would not be able to get the optimized result because of the similarity of the color of different objects. For this along with the color, the texture or shape or both should be combined to retrieve similar result. Color, textures, shapes are still low level feature and they should be used along with the high level features like text annotation for the optimized results". The content of an image is analyzed by extracting features such as color, shape, texture and histogram descriptors [9]. The accuracy of the retrieved results can be computed by "Precision and recall or by Length of String to Recover All Relevant Images i.e. LSRR" [3]. Low-level feature extraction techniques gave better results, but they suffered from the semantic gap: it was difficult to translate the user's need for an image completely into a CBIR query. As a result, the retrieved images were less relevant than desired [15]. CBIR emerged in 1992 after extensive research, and since then many CBIR systems have been developed for commercial use [14]. Some of these systems are briefly mentioned in the following section. The field is attracting a substantial number of researchers; however, the use of CBIR is still not very common and remains in its infancy. The image search engines of many large organizations, such as Yahoo Image Search and Google Image Search, still depend heavily on metadata [4]. Consequently, their performance is limited, particularly for complicated searches such as finding a particular region within an image.

II. RELATED WORK
CBIR systems came into existence in the 1990s, and extensive work is still going on in this area. Among the best-known early systems are Query by Image Content (QBIC) and Photobook, together with its newer version Four-Eyes. Other well-known systems are the search engine family Visual-SEEk, Meta-SEEk and Web-SEEk, NETRA, and the Multimedia Analysis and Retrieval System (MARS). A new method, Intelligent Query (IQ), was proposed to retrieve images from the database [10]. The images were obtained on the basis of a semantic method rather than metadata. A sketch drawn by a human was provided as the query image, and images similar to the sketch were extracted from the database. The extracted images were ranked according to their similarity. If the result was not sufficiently accurate, the sketch was modified and the process was repeated [15]. The results produced by IQ were very accurate, but the method could have been more effective if the images had been retrieved with additional semantic features. One researcher used a domain-dependent concept for image retrieval [7]. Researchers have also proposed a system called Mosaicture to generate mosaics. The system is implemented using the color feature of the database images, using color histogram values followed by a binary signature. Not all features are addressed in this work, so the accuracy of the system is not very high [17].
In another work, images were retrieved using a fuzzy-neural hybrid system. A Feed Forward Back Propagation Neural Network (FNN) was used for image classification, and the images were retrieved using low-level feature extraction. An attempt was made to show the maximum amount of detail of an image as provided by its histogram. For recognition, a Fuzzy Logic Approach (FLA) and an Artificial Neural Network (ANN) approach were used. The authors established that the histogram of Discrete Cosine coefficients, rather than the conventional intensity histogram, is a better measure of image detail [22].
The results obtained were accurate, but the study was based on low-level feature extraction only; it would have been better if high-level feature extraction had been used along with the low-level features. The results would have been more effective if a combination of semantic features had been used [15]. A system called Mosaicture has been proposed to generate mosaics. The system is implemented using the color feature of the database images, using color histogram values followed by a binary signature [13]. Researchers have retrieved satellite images on a regional basis using the Motif Co-occurrence Matrix (MCM) in conjunction with spatial relationships. The image was decomposed into segments, and the MCM was then computed for each region [5]. A web-enabled application was developed in which the user can find images that match the query image. A domain-dependent concept for image retrieval was implemented using an ontology approach. The user queries were based on satellite image attributes such as the sensor type, the time of image capture, etc. The experiment was performed on LANDSAT TM imagery of water bodies. The results were prioritized using a prioritizing algorithm [15].

Figure 1. Interface for ontology building tool [7]

So far, images had been retrieved using low-level and high-level feature extraction methods. One author, however, used measurement scales for image retrieval. The three measurement scales are the interval scale, the ordinal scale and the ratio scale. The performance of semantic features with the different measurement scales was evaluated using the Average Normalized Modified Retrieval Rank (ANMRR). The results showed that, among the three, the ordinal measurement scale was the most effective for image retrieval [21]. Researchers have used 3 LISS III + multi-spectral satellite images with 23.5 m resolution.
The following techniques were used for feature extraction and matching: color moments were used for color feature extraction and the GLCM was used for texture feature extraction, followed by K-means clustering to form an index. The images were then retrieved by comparing the query image with the images stored in the database. The entropy obtained with the GLCM method was small for texturally uniform images [6].
The experiment was performed only on images obtained by the LISS III sensor and on multi-spectral images; a wide variety of images was not used.

Figure 2(b). CBIR system [6]

The researchers tried to bridge the semantic gap by using semantic categories, such as field, water and vegetation, for image retrieval. In that paper, two LISS III images were used in addition to multi-spectral satellite images with 23.5 m resolution. The similarity measures were computed using color moments, GLCM, NDVI and the combined features. The performance was validated using precision and recall [6].
In another study, images were retrieved using low-level together with high-level feature extraction techniques. Two types of satellite images were taken: one of an urbanized area and one of a rural area. A color-based content processing technique and a histogram technique were applied to the images. The color-based content processing technique was found to be useful for the urbanized area but not for the rural area, because there is no variation in the features in the rural area. For the rural area, the histogram provided the better result [8]. Researchers have decomposed images on the basis of spatial and spectral heterogeneity. The features were extracted on the basis of visual features, spatial relationships, semantics, scene semantics and object semantics. The images were then retrieved on the basis of the mapping and the SS modeling. Finally, the computed result was validated against the traditional SBRSIR technique. The performance was evaluated using precision, recall and F-score. The results showed that the Remote Sensing Image Retrieval (RSIR) system provided more precise images than the conventional SBRSIR technique [20]. One researcher retrieved images using texture and color analysis. Four different color texture approaches were used for image classification on the VisTex database, and the results showed that RGB provides better classification results [12]. Another researcher worked on oceanography. The objective was the retrieval of mesoscale structures in the oceans through a CBIR technique. The study was motivated by the occurrence of natural hazards in the oceans: by detecting similar images, researchers would be able to identify hazard-prone areas in the ocean. The input images were classified using neurofuzzy logic.
A comparative study of the classification techniques was carried out, and it was found that the run time for classification using neurofuzzy logic and for image retrieval of regions of interest was less than 0.001 sec. The classified images were gathered from the Advanced Very High Resolution Radiometer (AVHRR) sensors of the National Oceanic and Atmospheric Administration (NOAA). The study was performed on images acquired from a single sensor; the results would have been more convincing if they had been validated with different sensors [18]. A system was designed using texture features for high-resolution satellite images. It used a local binary feature and a block-based scheme for image retrieval. The results obtained were very accurate. However, high-resolution satellite images contain structure as well as texture, and if structural features had been included the results could have been considerably more accurate [19].

III. PROPOSED METHODOLOGY
The steps followed in the experiment are shown in Figure 4. The procedure is divided into the following stages: pre-processing of the satellite image, followed by creation of the feature vector using the feature extraction techniques. All low-level techniques are used to extract the results. The hybrid model is then applied to obtain more relevant results, and the results are computed using the similarity measure. The performance is measured using the precision and recall metrics.
• The projection of the input, the output and the shapefile must be the same.
• Identify the area of interest (AOI) and mask the region.
• Apply the mask on the shapefile to get the desired region data.

A. Feature Extraction
The retrieval of images from the database depends on two major factors: the extraction of features and creation of the feature vector, and the similarity measure. In the proposed work, the Euclidean distance is used as the similarity measure. For feature extraction, all the low-level descriptor techniques discussed below are applied [1].
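The ranking step described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: the function names, the image identifiers and the three-element feature vectors are hypothetical and chosen only to show how a Euclidean distance turns feature vectors into a ranked result list.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_similarity(query, database):
    """Return (image_id, distance) pairs, most similar image first."""
    scored = [(image_id, euclidean_distance(query, vector))
              for image_id, vector in database.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Hypothetical feature vectors, for illustration only.
database = {
    "img_a": [0.0, 0.0, 0.0],
    "img_b": [3.0, 4.0, 0.0],
    "img_c": [1.0, 0.0, 0.0],
}
ranking = rank_by_similarity([0.0, 0.0, 0.0], database)
```

A smaller distance means a closer match, so the first entries of `ranking` are the images that would be shown to the user first.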

a. Color Feature Extraction
Color is one of the most important features used for image feature extraction, because humans differentiate objects very easily on the basis of color. Color is a property that depends on the reflection of light: the light reflected from an object reaches the eye, and that information is processed in the brain. Several methods are used to extract the color features of an image; they are explained below.
• Color Histogram
The most important method of representing the color feature of an image is the color histogram. A color histogram consists of a set of bins, where each bin denotes the probability of occurrence of a particular color among the pixels. The color histogram CH for an image is defined as:

$$CH = \{CH[0], CH[1], \ldots, CH[i], \ldots, CH[N]\}$$
where i is a color in the color space, CH[i] is the number of pixels of color i in the image, and N is the number of bins, i.e. the number of colors in the image. A color histogram can be shown as a two-dimensional graph in which each bar on the x-axis represents a bin and the y-axis corresponds to the number of pixels in that bin. There are two types of color histogram: the Global Color Histogram (GCH) and the Local Color Histogram (LCH). Of the two, the LCH contains more information about an image, but it is computationally expensive to use when comparing images. In the case of the GCH, a single color histogram represents the whole image.
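A global color histogram of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions: the image is given as a flat list of (r, g, b) tuples, each channel is quantized into 4 levels (64 bins in total), and the counts are normalized so that each bin holds a probability. The function name and the bin count are hypothetical choices, not values taken from the paper.

```python
def global_color_histogram(pixels, bins_per_channel=4, max_value=255):
    """Normalized global color histogram of a list of (r, g, b) pixels."""
    n_bins = bins_per_channel ** 3
    hist = [0] * n_bins
    step = (max_value + 1) / bins_per_channel  # width of one quantization level
    for r, g, b in pixels:
        ri, gi, bi = (int(v // step) for v in (r, g, b))
        # Flatten the three quantized channel indices into one bin index.
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    total = len(pixels)
    return [count / total for count in hist] if total else hist

# Hypothetical four-pixel image: two red pixels, one blue, one green.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
hist = global_color_histogram(pixels)
```

A local color histogram would apply the same function to each block of a partitioned image and concatenate the results, which is why it carries more spatial information at a higher cost.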

• Color Moments
Color moments work well when the image contains just the object. They are the statistical moments of the probability distribution of colors in the image and are used in retrieval systems. The parameters calculated in this method are the mean, the variance and the skewness. Color moments are an effective and efficient way of describing the color distribution, but they do not capture the spatial information surrounding a color within the image. The feature vector is formed by the mean, which is the first-order descriptor, the variance, which is the second-order descriptor, and the skewness, which is the third-order descriptor. In their commonly used form, for a color channel i containing n pixels with values p_{ij}, they are defined as follows:

$$\mu_i = \frac{1}{n}\sum_{j=1}^{n} p_{ij}$$

$$\sigma_i^2 = \frac{1}{n}\sum_{j=1}^{n} (p_{ij} - \mu_i)^2$$

$$s_i = \left(\frac{1}{n}\sum_{j=1}^{n} (p_{ij} - \mu_i)^3\right)^{1/3}$$

• Color Correlogram
This method is used for encoding the color information of an image. It addresses a weakness of the histogram representation because it retains spatial data in the encoded color information. It also has other advantages: it represents the global distribution of the local spatial correlation of colors, and it is simple to compute. The color correlogram shows the distribution of color correlation in an image. Whereas in the color histogram the image is divided into four bins, in the correlogram each of those four bins is further subdivided into four bins, and for every such division the maximum of the frequencies is calculated.
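To illustrate how a correlogram keeps spatial information that a histogram discards, the sketch below computes a color autocorrelogram in its commonly cited form: for each color c and distance d, the probability that a pixel at distance d from a pixel of color c also has color c. This is an assumption about the intended descriptor, since the text does not give its formula. The function name, the use of the chessboard distance and the input format (a 2-D grid of already quantized color indices) are likewise illustrative choices.

```python
def autocorrelogram(image, n_colors, distances=(1,)):
    """Color autocorrelogram of a 2-D grid of quantized color indices.

    For each distance d, returns a list whose entry c estimates the
    probability that a pixel at chessboard distance d from a pixel of
    color c has color c as well.
    """
    height, width = len(image), len(image[0])
    result = {}
    for d in distances:
        same = [0] * n_colors   # neighbor pairs with matching color
        total = [0] * n_colors  # all neighbor pairs anchored at color c
        for y in range(height):
            for x in range(width):
                c = image[y][x]
                for dy in range(-d, d + 1):
                    for dx in range(-d, d + 1):
                        if max(abs(dy), abs(dx)) != d:
                            continue  # keep only the ring at distance d
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            total[c] += 1
                            same[c] += image[ny][nx] == c
        result[d] = [s / t if t else 0.0 for s, t in zip(same, total)]
    return result

# Hypothetical 2x2 checkerboard with two colors.
checker = autocorrelogram([[0, 1], [1, 0]], n_colors=2)
```

Two images with identical histograms, such as a checkerboard and a half-and-half split of the same two colors, give different autocorrelograms, which is the property the descriptor is used for.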

• Color Coherence Vector
This method provides better retrieval results than the other color methods because it partitions the histogram according to the spatial coherence of the image. Each pixel in the image is classified as one of two types, coherent or incoherent, depending on whether it belongs to a region of uniform color. A separate histogram is obtained for each type, which brings spatial information into the feature vector [16] and is the reason for the better results. The color coherence vector is computed from the following quantities: P and N are the total numbers of coherent and incoherent pixels, respectively, and P_i and N_i are the numbers of coherent and incoherent pixels of color i, respectively.
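The coherent and incoherent counts can be obtained with a connected-component pass, as in the sketch below. It assumes the usual construction of a color coherence vector: a pixel is coherent when its 8-connected region of the same color contains at least `tau` pixels. The threshold `tau`, the function name and the input format (a 2-D grid of quantized color indices) are assumptions made for illustration; the text does not state them.

```python
from collections import deque

def color_coherence_vector(image, n_colors, tau):
    """Return [(P_i, N_i)] for a 2-D grid of quantized color indices.

    P_i counts pixels of color i in same-color 8-connected regions of at
    least `tau` pixels (coherent); N_i counts the remaining pixels.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    coherent = [0] * n_colors
    incoherent = [0] * n_colors
    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            color = image[y][x]
            seen[y][x] = True
            queue = deque([(y, x)])
            size = 0
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx]
                                and image[ny][nx] == color):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if size >= tau:
                coherent[color] += size
            else:
                incoherent[color] += size
    return list(zip(coherent, incoherent))

# Hypothetical 3x3 image with two colors.
grid = [
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]
ccv = color_coherence_vector(grid, n_colors=2, tau=2)
```

Summing the first and second elements of the pairs gives the totals P and N, and summing each pair gives the ordinary histogram count for that color.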

b. Shape Feature Extraction
Shape is a property that separates an object from its surroundings. Shape descriptors can be divided into two categories: region based and boundary based. Region-based descriptors use the internal characteristics of the shape, that is, the pixel values contained in the region, whereas boundary-based descriptors use the outer boundary of the shape and therefore describe its external characteristics.

c. Texture Feature Extraction
Texture is an important feature for image retrieval [2].
• GLCM
The Grey-Level Co-occurrence Matrix (GLCM) is an extensively used texture descriptor. It counts how often a grey level occurs in a given spatial relationship to another grey level, and it represents the combinations of pixel brightness values in an image in tabular form. The main advantage of the GLCM is that it considers pairs of pixels with respect to both the distance and the angular spatial relationship between them at the same time. As a result, the combinations of grey levels and their positions are shown clearly. It is therefore defined as "A two dimensional histogram of grey levels for pair of pixels, which are separated by a fixed spatial relationship". Second-order statistics are then computed from the GLCM. With X(i,j | d,θ) denoting the normalized co-occurrence frequency of grey levels i and j at distance d and angle θ, the following statistical parameters are used in their commonly used form:

$$\text{Energy} = \sum_{i}\sum_{j} X(i,j \mid d,\theta)^2$$

Energy measures the number of repeated pairs: where the repetition is high, the energy will be high.

$$\text{Homogeneity} = \sum_{i}\sum_{j} \frac{X(i,j \mid d,\theta)}{1 + |i-j|}$$

Homogeneity measures the closeness of the distribution of GLCM elements to the GLCM diagonal.

$$\text{Contrast} = \sum_{i}\sum_{j} (i-j)^2 \, X(i,j \mid d,\theta)$$

Contrast measures the difference between the pixel values: if the difference between the pixel values is small, the contrast will be low.

$$\text{Correlation} = \sum_{i}\sum_{j} \frac{(i-\mu_i)(j-\mu_j)\, X(i,j \mid d,\theta)}{\sigma_i \sigma_j}$$

Here μ_i, μ_j and σ_i, σ_j are the means and standard deviations of the row and column marginal distributions of X. Correlation measures how strongly the two pixels of a pair are correlated: if the pixels of the pairs are not correlated with each other, the correlation will be low.
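The GLCM statistics can be sketched as below. The code assumes the commonly used definitions of energy, homogeneity, contrast and correlation on a normalized co-occurrence matrix, and it is not the authors' MATLAB code. The function names, the (dy, dx) pixel offset standing in for the distance and angle, and the tiny two-level test image are hypothetical.

```python
import math

def glcm(image, levels, dy=0, dx=1):
    """Normalized co-occurrence matrix for the pixel offset (dy, dx)."""
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    pairs = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
                pairs += 1
    if pairs == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[c / pairs for c in row] for row in counts]

def glcm_features(p):
    """Energy, homogeneity, contrast and correlation of a normalized GLCM."""
    idx = range(len(p))
    energy = sum(p[i][j] ** 2 for i in idx for j in idx)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx)
    contrast = sum((i - j) ** 2 * p[i][j] for i in idx for j in idx)
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx))
    if sd_i == 0 or sd_j == 0:
        correlation = float("nan")  # undefined for a constant image
    else:
        correlation = sum((i - mu_i) * (j - mu_j) * p[i][j]
                          for i in idx for j in idx) / (sd_i * sd_j)
    return {"energy": energy, "homogeneity": homogeneity,
            "contrast": contrast, "correlation": correlation}

# Hypothetical 2x4 image with two grey levels, horizontal neighbor offset.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
features = glcm_features(glcm(image, levels=2))
```

Computing the matrix for several offsets (for example horizontal, vertical and the two diagonals) and concatenating the four statistics of each is one common way to build the texture part of a feature vector.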

d. Hybrid Approach
In the hybrid approach, color-, shape- and texture-based feature extraction are combined. Feature extraction from a satellite image based on a single method does not give the desired results, so the hybrid approach was tested.
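One simple way to combine the descriptors is to concatenate their feature vectors, as sketched below. The L2 normalization of each descriptor before concatenation is an assumption added here so that no single descriptor dominates the Euclidean distance; the text does not say how the vectors are scaled or weighted. The function names and the example values are hypothetical.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors stay zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else [0.0] * len(vector)

def hybrid_feature_vector(color, shape, texture):
    """Concatenate the color, shape and texture vectors after normalization."""
    combined = []
    for part in (color, shape, texture):
        combined.extend(l2_normalize(part))
    return combined

# Hypothetical descriptor values, for illustration only.
hybrid = hybrid_feature_vector(color=[3.0, 4.0], shape=[2.0], texture=[0.0, 0.0])
```

The resulting vector can be compared with the query image's hybrid vector using the same Euclidean distance that is applied to the individual descriptors.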
IV. RESULTS AND DISCUSSION
All color, shape and texture algorithms are implemented in MATLAB. Color, shape and texture are low-level feature extraction techniques, which give results but do not bridge the semantic gap. To obtain more accurate results, the hybrid approach is applied, which combines the feature vector values of all the descriptors. The results show that the hybrid approach gives better precision and recall values than the individual low-level descriptors. Fig. 5, 6 and 7 show the ranked results obtained from the low-level feature extraction techniques, and Fig. 8 shows the ranked images obtained by the hybrid approach. Table 3 shows the precision and recall obtained from all the methods applied. We have compared the results of our proposed work with the results of existing systems. Table 4 shows that the precision obtained with our technique is higher than that of the existing systems. This indicates that the hybrid approach yields effective and efficient retrieval.
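For reference, precision and recall for a single query can be computed as in the following sketch. Precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that were retrieved. The function name and the image identifiers are hypothetical and do not correspond to the paper's dataset or to the values in its tables.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one query result against its ground truth."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical query: four images returned, three images truly relevant.
precision, recall = precision_recall(
    retrieved=["a", "b", "c", "d"],
    relevant=["a", "c", "e"],
)
```

In this example two of the four returned images are relevant, so the precision is 0.5, and two of the three relevant images were found, so the recall is 2/3.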

V. CONCLUSION
In this paper, experiments have been performed on low-level feature extraction from images using color, shape and texture. Among these techniques, texture gave the best precision and recall values. In CBIR, extracting features with low-level feature extraction techniques alone does not bridge the semantic gap. The hybrid approach was therefore applied to the dataset, and the results obtained were better. In future work, different color, shape and texture approaches will be explored to improve the results further.