Personalized Search Engine from User Click through Log
ABSTRACT:
Image re-ranking is effective for improving the performance of text-based image search. However, existing re-ranking algorithms are limited for two main reasons: 1) the textual metadata associated with images is often mismatched with their actual visual content, and 2) the extracted visual features do not accurately describe the semantic similarities between images. Recently, user click information has been used in image re-ranking, because clicks have been shown to describe the relevance of retrieved images to search queries more accurately. However, a critical problem for click-based methods is the lack of click data, since only a small number of web images have actually been clicked on by users. Therefore, we aim to solve this problem by predicting image clicks. We propose a multimodal hypergraph learning-based sparse coding method for image click prediction, and apply the obtained click data to the re-ranking of images. We adopt a hypergraph to build a group of manifolds, which explore the complementarity of different features through a group of weights. Unlike a graph that has an edge between two vertices, a hyperedge in a hypergraph connects a set of vertices, and helps preserve the local smoothness of the constructed sparse codes. An alternating optimization procedure is then performed, and the weights of different modalities and the sparse codes are simultaneously obtained. Finally, a voting strategy is used to describe the predicted click as a binary event (click or no click), from the images’ corresponding sparse codes. Thorough empirical studies on a large-scale database including nearly 330K images demonstrate the effectiveness of our approach for click prediction when compared with several other methods. Additional image re-ranking experiments on real-world data show that click prediction improves the performance of prominent graph-based image re-ranking algorithms.
EXISTING SYSTEM:
Most existing re-ranking methods use a tool known as pseudo-relevance feedback (PRF), where a proportion of the top-ranked images are assumed to be relevant, and subsequently used to build a model for re-ranking. This is in contrast to relevance feedback, where users explicitly provide feedback by labeling the top results as positive or negative. In the classification-based PRF method, the top-ranked images are regarded as pseudo-positive, and low-ranked images regarded as pseudo-negative examples to train a classifier, and then re-rank. Hsu et al. also adopt this pseudo-positive and pseudo-negative image method to develop a clustering-based re-ranking algorithm.
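The labeling step of classification-based PRF can be sketched as follows. This is a minimal illustration, not the method of any cited work: the function name `pseudo_relevance_labels` and the cut-offs `top_k` and `bottom_k` are hypothetical, and the classifier that would be trained on these labels is left out.

```python
def pseudo_relevance_labels(ranked_ids, top_k, bottom_k):
    """Label the first top_k results pseudo-positive (1) and the last
    bottom_k results pseudo-negative (0). Results in between stay unlabeled.
    Both cut-offs are illustrative assumptions."""
    if top_k + bottom_k > len(ranked_ids):
        raise ValueError("top_k + bottom_k exceeds the number of results")
    positives = ranked_ids[:top_k]
    negatives = ranked_ids[len(ranked_ids) - bottom_k:]
    labels = {image_id: 1 for image_id in positives}
    labels.update({image_id: 0 for image_id in negatives})
    return labels


# Initial text-based ranking for one query (made-up identifiers).
ranked = ["img_a", "img_b", "img_c", "img_d", "img_e", "img_f"]
labels = pseudo_relevance_labels(ranked, top_k=2, bottom_k=2)
```

Nothing in this step checks whether the top-ranked images are actually relevant. The labels simply inherit the errors of the initial text-based ranking.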
DISADVANTAGES OF EXISTING SYSTEM:
- One major problem affecting performance is the mismatch between the actual content of an image and the textual data on its web page.
- The reliability of the pseudo-positive and pseudo-negative images obtained by these methods is not guaranteed.
PROPOSED SYSTEM:
In this paper we propose a novel method named multimodal hypergraph learning-based sparse coding for click prediction, and apply the predicted clicks to re-rank web images. Both strategies of early and late fusion of multiple features are used in this method through three main steps.
- We construct a web image base with associated click annotations, collected from a commercial search engine. The search engine has recorded the clicks for each image. Images with high click counts are strongly relevant to the queries, while images with zero clicks are treated as non-relevant. These two components form the image bases.
- We consider both early and late fusion in the proposed objective function. Early fusion is realized by directly concatenating multiple visual features, and is applied in the sparse coding term. Late fusion is accomplished in the manifold learning term. For web images without clicks, we implement hypergraph learning to construct a group of manifolds, which preserves local smoothness using hyperedges. Unlike a graph that has an edge between two vertices, a hyperedge in a hypergraph connects a set of vertices. Common graph-based learning methods usually consider only the pairwise relationship between two vertices, ignoring the higher-order relationship among three or more vertices. Using this term helps the proposed method preserve the local smoothness of the constructed sparse codes.
- Finally, an alternating optimization procedure is conducted to explore the complementary nature of different modalities. The weights of different modalities and the sparse codes are simultaneously obtained using this optimization strategy. A voting strategy is then adopted to predict if an input image will be clicked or not, based on its sparse code.
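The early-fusion step in the list above (direct concatenation of multiple visual features) can be illustrated with a short sketch. The per-feature L2 normalization is an assumption added here so that no single descriptor dominates the fused vector, and the feature names are made up.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def early_fusion(feature_sets):
    """Concatenate several per-image feature vectors into one long vector.
    Normalizing each block first is an illustrative assumption."""
    fused = []
    for vec in feature_sets:
        fused.extend(l2_normalize(vec))
    return fused


# Two hypothetical descriptors of the same image.
color_hist = [3.0, 4.0]
texture = [0.0, 2.0, 0.0]
fused = early_fusion([color_hist, texture])
```

The fused vector has one entry per dimension of every input descriptor, so its length is the sum of the individual feature lengths.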
ADVANTAGES OF PROPOSED SYSTEM:
- We effectively utilize search engine derived images annotated with clicks, and successfully predict the clicks for new input images without clicks. Based on the obtained clicks, we re-rank the images, a strategy which could be beneficial for improving commercial image searching.
- We propose a novel method named multimodal hypergraph learning-based sparse coding. This method uses both early and late fusion in multimodal learning. By simultaneously learning the sparse codes and the weights of different hypergraphs, the performance of sparse coding improves significantly.
MODULES:
- Annotation Process
- Concatenating Technique
- Optimization Process
MODULES DESCRIPTION:
Annotation Process:
First, we construct a web image base with associated click annotations, collected from a commercial search engine. As shown in Fig. 1, the search engine has recorded clicks for each image. Fig. 1(a), (b), (e), and (f) indicate that the images with high clicks are strongly relevant to the queries, while Fig. 1(c), (d), (g), and (h) present non-relevant images with zero clicks. These two components form the image bases.
We effectively utilize search engine derived images annotated with clicks, and successfully predict the clicks for new input images without clicks. Based on the obtained clicks, we re-rank the images, a strategy which could be beneficial for improving commercial image searching.
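As a rough sketch of how the two image bases might be assembled from a click log, the snippet below counts clicks per image for one query and splits the candidates. The log layout of `(query, image_id)` pairs, the function name `build_image_base`, and the `high_click_threshold` parameter are all assumptions. The source only states that high-click images are relevant and zero-click images are non-relevant.

```python
from collections import Counter


def build_image_base(click_log, query, candidates, high_click_threshold):
    """Split candidate images for one query into a relevant base (many clicks)
    and a non-relevant base (zero clicks). Images with a few clicks fall in
    neither base. The threshold value is an illustrative assumption."""
    counts = Counter(image_id for q, image_id in click_log if q == query)
    relevant = [img for img in candidates if counts[img] >= high_click_threshold]
    non_relevant = [img for img in candidates if counts[img] == 0]
    return relevant, non_relevant


# Made-up click log: one (query, image_id) pair per recorded click.
log = [("cat", "a"), ("cat", "a"), ("cat", "a"), ("cat", "b"), ("dog", "c")]
relevant, non_relevant = build_image_base(
    log, "cat", ["a", "b", "c", "d"], high_click_threshold=3
)
```

In this toy log, image `c` was clicked only for a different query, so for the query `cat` it counts as a zero-click image.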
Concatenating Technique:
Second, we propose a novel method named multimodal hypergraph learning-based sparse coding. This method uses both early and late fusion in multimodal learning. By simultaneously learning the sparse codes and the weights of different hypergraphs, the performance of sparse coding improves significantly. We consider both early and late fusion in the proposed objective function. Early fusion is realized by directly concatenating multiple visual features, and is applied in the sparse coding term. Late fusion is accomplished in the manifold learning term. For web images without clicks, we implement hypergraph learning to construct a group of manifolds, which preserves local smoothness using hyperedges. Unlike a graph that has an edge between two vertices, a hyperedge in a hypergraph connects a set of vertices. Common graph-based learning methods usually consider only the pairwise relationship between two vertices, ignoring the higher-order relationship among three or more vertices. Using this term helps the proposed method preserve the local smoothness of the constructed sparse codes.
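To make the hyperedge idea concrete, the sketch below builds one hyperedge per image from its k nearest neighbors and evaluates a normalized hypergraph smoothness cost in the style of the standard hypergraph Laplacian regularizer. Several things are assumptions: the k-nearest-neighbor construction, unit hyperedge weights, and the use of one scalar value per image in place of a full sparse code vector. The source does not specify how its hyperedges are built.

```python
import math


def knn_hyperedges(features, k):
    """One hyperedge per vertex: the vertex plus its k nearest neighbors
    (Euclidean distance). This construction is an illustrative assumption."""
    edges = []
    for i, centre in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: math.dist(centre, features[j]))
        edges.append(frozenset([i] + others[:k]))
    return edges


def smoothness(edges, values, weights=None):
    """Normalized hypergraph smoothness of one scalar value per vertex:
    0.5 * sum over hyperedges e and ordered vertex pairs (u, v) in e of
    w(e)/|e| * (f(u)/sqrt(d(u)) - f(v)/sqrt(d(v)))^2,
    where d(v) is the total weight of the hyperedges containing v."""
    weights = weights or [1.0] * len(edges)
    degree = [0.0] * len(values)
    for edge, w in zip(edges, weights):
        for v in edge:
            degree[v] += w
    total = 0.0
    for edge, w in zip(edges, weights):
        for u in edge:
            for v in edge:
                diff = (values[u] / math.sqrt(degree[u])
                        - values[v] / math.sqrt(degree[v]))
                total += w / len(edge) * diff * diff
    return 0.5 * total


# Two well-separated clusters of toy 2-D features.
features = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
edges = knn_hyperedges(features, k=2)
smooth_cost = smoothness(edges, [1, 1, 1, 0, 0, 0])  # constant within each cluster
rough_cost = smoothness(edges, [1, 0, 1, 0, 1, 0])   # varies inside each cluster
```

Each hyperedge here joins three vertices at once, which an ordinary pairwise edge cannot express. Values that agree within a hyperedge give a low cost, and values that vary inside a hyperedge are penalized.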
Optimization Process:
Finally, an alternating optimization procedure is conducted to explore the complementary nature of different modalities. The weights of different modalities and the sparse codes are simultaneously obtained using this optimization strategy. A voting strategy is then adopted to predict if an input image will be clicked or not, based on its sparse code. The obtained click is then integrated within a graph-based learning framework to achieve image re-ranking.
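The source does not spell out the voting rule, so the following is only one plausible reading, offered as a sketch. It assumes each coefficient of the sparse code corresponds to one image in the image base, and that the input image is predicted as clicked when the coefficients on clicked base images outweigh those on zero-click base images. The function name `predict_click` and the use of absolute coefficient sums are assumptions.

```python
def predict_click(sparse_code, atom_is_clicked):
    """Binary click prediction from a sparse code (assumed voting rule).
    sparse_code[i] is the coefficient on base image i, and atom_is_clicked[i]
    says whether that base image belongs to the clicked base."""
    if len(sparse_code) != len(atom_is_clicked):
        raise ValueError("one flag is required per sparse-code coefficient")
    clicked_vote = sum(
        abs(c) for c, flag in zip(sparse_code, atom_is_clicked) if flag
    )
    unclicked_vote = sum(
        abs(c) for c, flag in zip(sparse_code, atom_is_clicked) if not flag
    )
    return clicked_vote > unclicked_vote


# First two base images were clicked, last two were not (toy values).
flags = [True, True, False, False]
likely_clicked = predict_click([0.0, 0.7, 0.0, 0.2], flags)
likely_ignored = predict_click([0.1, 0.0, 0.5, 0.0], flags)
```

Under this rule a tie counts as no click, which is a conservative choice for a binary re-ranking signal.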
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
- System : Pentium IV 2.4 GHz.
- Hard Disk : 40 GB.
- Floppy Drive : 1.44 MB.
- Monitor : 15" VGA Colour.
- Mouse : Logitech
- RAM : 512 MB.
SOFTWARE REQUIREMENTS:
- Operating system : Windows XP/7.
- Coding Language : net, C#.net
- Tool : Visual Studio 2010
- Database : SQL SERVER 2008