Diversity in Image Retrieval using Randomization and Learned Metrics
P Vidyadhar Rao (homepage)
Providing useful information to user requested queries is the most investigated problem in the multi-media retrieval community. The problem of information retrieval usually has many possible solutions, due to uncertainties in the user's information need and ambiguities in query specification. Some mechanism is required to evaluate the options and select a solution. This is quite a challenging task. In the recent years, the focus is gradually shifting towards relevance and diversity of retrieved information, which together improve the usefulness of retrieval system as perceived by users. Intuitively it is desirable to design a retrieval system with three requirements: a) Accurate retrieval i.e., the method should have high precision, b) Diverse retrieval, i.e., the obtained results should be diverse, c) Efficient retrieval, i.e., response time should be small. While considerable effort has been expended to develop algorithms which incorporate both relevance and diversity in the retrieval process, relatively less attention has been given to the problem of finding efficient diverse retrieval algorithms.
The main contribution of this thesis lies in developing efficient algorithms for the diverse retrieval problem. We show that the diverse retrieval problem can be mathematically defined as an integer convex optimization problem, and hence finding the optimal solution is NP-Hard. The existing approximate and greedy algorithms that try to find solution to this problem suffer from two drawbacks: a) Running time of the algorithms is very high as it is required to recover several exact nearest neighbors. b) Computations may require an unreasonably large amount of memory overhead for large datasets. In this work, we propose a simple approach to overcome all the above simultaneously based on two ideas: 1) Randomization and 2) Learned Metrics.
In the first case, the method is based on locality sensitive hashing and tries to address all of the above requirements simultaneously. We show that the effectiveness of our method depends on randomization in the design of the hash functions. Further, we derive a theoretically sound result to support the intuitiveness and reliability of using hash functions (via randomization) in the retrieval process to improve diversity. We modify the standard hash functions to take into account the distribution of the data for better performance. We also formulate the diverse multi-label prediction(of images and web pages) in this setting and demonstrate the scalability and diversity in the solution. We demonstrate effectiveness of our approach in three tasks: Image Category Retrieval, Multi-label Classification and Image Tagging. Our findings show that the proposed hash functions in combination with the existing diversity-based methods significantly outperforms standard methods without using hash functions. Our method allows to achieve a trade-off between accuracy and diversity using easy to tune parameters. We examine evaluation measures for diversity in several retrieval scenarios and introduce a new notion to simultaneously evaluate a method's performance for both the precision and diversity measures. Our proposal does not harm, but instead increases the reliability of the measures in terms of accuracy and diversity while ensuring $100x$-speed-up over the existing diverse retrieval approaches.
In the second case, the method is based on learning distance metrics. We show that effectiveness of our method depends on the learned distance metrics that suits the user's interest. In the case of instance based image retrieval methods, relevance and diversity are relative to users viewpoint of the camera, time of day, and camera zoom. We argue that the low-level image features fail to capture diversity with respect to high-level human semantics. We, therefore, use the high-level semantic information to learn metrics and re-fashion the visual feature space to appreciate diversity better in the retrieval. Our experiments confirm that our proposal is the best strategy from a learning perspective, when compared to original feature space, that the learned metrics provide better diversity in the retrieval.
In conclusion, in this thesis we discussed two fundamental ideas for retrieving diverse set of results. From the algorithmic and statistical perspective, the proposed method intuitively uses "randomness as resource" to improve diversity in retrieval while ensuring sub-linear retrieval time. From the visual perspectives, the proposed method utilizes user level semantics to "learn metrics" for improving diversity in instance based image retrieval. We believe that the ideas presented in this thesis are not limited to image retrieval and therefore, its applicability to different definitions of diversity (visual, temporal, spatial, and topical aspects), knowledge source combination (image and text), interactive retrieval systems (relevance feedback), and so forth are possible.
|Year of completion:
Vidyadhar Rao, Ajitesh Gupta, Visesh Chari, C. V. Jawahar - Learning Metrics for Diversity in Instance Retrieval Proceedings of the 5th National Confernece on Computer Vision, Pattern Recognition, Image Processing and Graphics, 16-19 Dec 2015, Patna, India. [PDF]
Vidyadhar Rao and C V Jawahar - Semi-Supervised Clustering by Selecting Informative Constraints Proceedings of 5th International Conference on Pattern Recognition and Machines Intelligence, 10-14 Dec. 2013, Kolkata, India. [PDF]