A Mixed Generative-Discriminative Based Hashing Method
ABSTRACT:
Hashing methods have proven to be useful for a variety of tasks and have attracted extensive attention in recent years. Various hashing approaches have been proposed to capture similarities between textual, visual, and cross-media information. However, most of the existing works use bag-of-words methods to represent textual information. Since words with different forms may have similar meanings, semantic-level text similarities cannot be handled well by these methods. To address these challenges, in this paper we propose a novel method called semantic cross-media hashing (SCMH), which uses continuous word representations to capture textual similarity at the semantic level and a deep belief network (DBN) to construct the correlation between different modalities. To demonstrate the effectiveness of the proposed method, we evaluate it on three commonly used cross-media data sets. Experimental results show that the proposed method achieves significantly better performance than state-of-the-art approaches. Moreover, the efficiency of the proposed method is comparable to or better than that of some other hashing methods.
EXISTING SYSTEM:
- Existing methods have proposed to use Canonical Correlation Analysis (CCA), manifold learning, dual-wing harmoniums, deep autoencoders, and deep Boltzmann machines to approach the task. Due to the efficiency of hashing-based methods, there is also a rich line of work focusing on the problem of mapping multi-modal high-dimensional data to low-dimensional hash codes, such as latent semantic sparse hashing (LSSH), discriminative coupled dictionary hashing (DCDH), cross-view hashing (CVH), and so on.
- Most of the existing works use a bag-of-words model to represent textual information.
DISADVANTAGES OF EXISTING SYSTEM:
- Due to the lack of sufficient training samples, user relevance feedback had to be used to refine cross-media similarities accurately.
- Textual and visual information were not used jointly in earlier systems.
PROPOSED SYSTEM:
- Motivated by the success of continuous-space word representations (also called word embeddings) in a variety of tasks, in this work we propose to incorporate word embeddings to meet these challenges. Words in a text are embedded in a continuous space, which can be viewed as a Bag-of-Embedded-Words (BoEW).
- Since the number of words in a text is variable, we propose a method to aggregate the BoEW into a fixed-length Fisher Vector (FV) using the Fisher kernel framework. However, this only covers textual information. Another challenge in this task is how to determine the correlation between multi-modal representations. Since we propose the Fisher kernel framework to represent textual information, we also use it to aggregate the SIFT descriptors of images.
- Through the Fisher kernel framework, both textual and visual information are mapped to points in the gradient space of a Riemannian manifold. However, the relationships that exist between FVs of different modalities are usually highly non-linear. Hence, to construct the correlation between textual and visual modalities, we introduce a DBN-based method to model the mapping function, which is used to convert abstract representations of different modalities from one to another.
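To illustrate how the Fisher kernel framework turns a variable-length set of descriptors (embedded words or SIFT features) into a fixed-length vector, here is a minimal Python/NumPy sketch. It computes only the gradients with respect to the GMM means (a common simplification); the function name and this reduced form are our own, and the project itself is implemented in MATLAB.

```python
import numpy as np

def fisher_vector_means(descriptors, means, covs, priors):
    """Simplified Fisher Vector: gradients w.r.t. GMM means only.

    descriptors: (N, D) variable-length set of local descriptors
    means, covs: (K, D) diagonal-covariance GMM parameters
    priors:      (K,)   mixture weights
    Returns a fixed-length vector of size K*D regardless of N.
    """
    N, D = descriptors.shape
    K = means.shape[0]
    # Soft assignment (posterior) of each descriptor to each component
    log_prob = np.empty((N, K))
    for k in range(K):
        diff = descriptors - means[k]
        log_prob[:, k] = (np.log(priors[k])
                          - 0.5 * np.sum(np.log(2 * np.pi * covs[k]))
                          - 0.5 * np.sum(diff ** 2 / covs[k], axis=1))
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Gradient statistics w.r.t. the component means, normalized
    fv = np.empty((K, D))
    for k in range(K):
        diff = (descriptors - means[k]) / np.sqrt(covs[k])
        fv[k] = (gamma[:, k, None] * diff).sum(axis=0) / (N * np.sqrt(priors[k]))
    return fv.ravel()
```

The key property for this task is that sets of 10 words and 25 SIFT descriptors both map to the same K*D-dimensional gradient space, where the DBN can then relate the two modalities.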
ADVANTAGES OF PROPOSED SYSTEM:
- The system proposes to incorporate continuous word representations to handle textual similarities at the semantic level and adapts them for cross-media retrieval.
- Inspired by the advantages of DBNs in handling highly non-linear relationships and noisy data, the system introduces a novel DBN-based method to construct the correlation between different modalities.
- A variety of experiments on three commonly used cross-media benchmarks demonstrate the effectiveness of the proposed method.
- The experimental results show that the proposed method can significantly outperform the state-of-the-art methods.
MODULES:
- Image Hash Code Generation
- Text Hash Code Generation
- Mapping – Hash Code of Feature Vector
- Image Retrieval
MODULE DESCRIPTIONS:
Image Hash Code Generation:
- In this stage, the test image is first acquired from the gallery.
- Key point extraction is then performed on the acquired test image using SIFT. The key points represent the visual information of the image.
- After that, principal component analysis (PCA) is applied to reduce the dimensionality. The mean, standard deviation, and energy are then calculated to generate a fixed-length feature vector.
- Finally, a hash code is generated from the fixed-length feature vector.
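The image-side pipeline above can be sketched as follows. This is a hedged Python/NumPy illustration, not the project's MATLAB implementation: SIFT extraction is assumed to have already produced the descriptor matrix, and the final sign-of-random-projection hashing step is our own placeholder for the learned hash function.

```python
import numpy as np

def image_hash(descriptors, n_components=8, n_bits=24):
    """Sketch of the image-side pipeline: PCA on local descriptors,
    mean/std/energy statistics, then a sign-based binary hash.
    `descriptors` is an (N, 128) matrix of SIFT descriptors (assumed
    to come from an external keypoint extractor)."""
    X = descriptors - descriptors.mean(axis=0)
    # PCA via SVD: project descriptors onto the top principal components
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:n_components].T
    # Fixed-length feature vector: per-dimension mean, std, and energy
    feat = np.concatenate([Z.mean(axis=0), Z.std(axis=0), (Z ** 2).mean(axis=0)])
    # Sign of random projections gives a fixed-length binary code;
    # the fixed seed keeps the projection identical across all images
    rng = np.random.default_rng(42)
    proj = rng.normal(size=(n_bits, feat.size))
    return (proj @ feat > 0).astype(np.uint8)
```

Because the statistics are computed per dimension, images with different numbers of key points still yield the same n_bits-length code.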
Text Hash Code Generation:
- In this second stage of our method, the text corresponding to the selected test image is used.
- Word embedding, i.e. the conversion of words to vectors, is then performed. Vector normalization is applied next to bring the vectors into a standard form.
- Finally, a hash code is generated from the normalized feature vector.
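A minimal Python sketch of the text-side steps, under stated assumptions: the word-to-vector lookup table is assumed to come from a pre-trained embedding model (e.g. word2vec), averaging is used as a simple stand-in for the Fisher Vector aggregation, and the sign-based hash again stands in for the learned hash function.

```python
import numpy as np

def text_hash(words, embeddings, n_bits=24):
    """Sketch of the text-side pipeline: look up word embeddings,
    average them, L2-normalize, then sign-based hashing.
    `embeddings` is an assumed word -> vector dict; out-of-vocabulary
    words are simply skipped."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    feat = np.mean(vecs, axis=0)
    feat = feat / np.linalg.norm(feat)      # vector normalization
    rng = np.random.default_rng(7)          # fixed seed: shared projection
    proj = rng.normal(size=(n_bits, feat.size))
    return (proj @ feat > 0).astype(np.uint8)
```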
Mapping – Hash Code of Feature Vector:
- In this stage, the hash code of the text and the hash code of the image are mapped.
- Combining the two hash codes is referred to as mapping.
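Reading "combine two hash codes" as simple concatenation (an assumption on our part; the paper itself learns the cross-modal mapping with a DBN), the step reduces to:

```python
import numpy as np

def combine_hashes(image_code, text_code):
    """Joint fixed-length text-image code. Concatenation is our assumed
    interpretation of 'combining' the two modality hash codes."""
    return np.concatenate([image_code, text_code])
```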
Image Retrieval:
- Our final stage is image retrieval.
- In this stage, similar images are retrieved by computing the similarity between the training text-image hash codes and the test text-image hash code.
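Since the codes are binary, similarity search amounts to ranking by Hamming distance, as in the Python sketch below (the function name and top-k interface are illustrative, not from the source):

```python
import numpy as np

def retrieve(query_code, database_codes, top_k=3):
    """Rank database items by Hamming distance to the query's joint
    text-image hash code and return the indices of the closest matches."""
    dists = (database_codes != query_code).sum(axis=1)  # Hamming distance
    return np.argsort(dists, kind="stable")[:top_k]
```

This is where binary hashing pays off: comparing codes needs only bitwise operations, so retrieval stays fast even over large galleries.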
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
- System : Pentium Dual Core.
- Hard Disk : 120 GB.
- Monitor : 15'' LED.
- Input Devices : Keyboard, Mouse.
- RAM : 1 GB.
SOFTWARE REQUIREMENTS:
- Operating System : Windows 7.
- Coding Language : MATLAB.
- Tool : MATLAB R2013a.
REFERENCE:
Qi Zhang, Yang Wang, Jin Qian, and Xuanjing Huang, "A Mixed Generative-Discriminative Based Hashing Method," IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 4, April 2016.