Interactive visualization and tuning of multi-dimensional clusters for indexing

Dasari Pavan Kumar

The automation of activities in all areas, including business, engineering, science, and government, produces an ever-increasing stream of data. In particular, the amount of multimedia content produced and made available on the Internet, in both professional and personal collections, is growing rapidly, and so is the need for efficient and effective ways to manage it. The data collected is believed to contain valuable information, but extracting such information or patterns is an extremely difficult task. This has led to a great amount of research into content-based retrieval and visual recognition. The most recent retrieval systems extract low-level image features and conceptualize them into clusters. A conventional sequential scan over those image features would take a few hours to search a set of only hundreds of images. Hence, clustering and indexing form the crux of the solution. The state of the art uses 128-dimensional SIFT vectors as low-level descriptors, and indexing even a moderate collection involves several million such vectors. Search performance depends on the quality of the indexing, and there is often a need to tune the process interactively for better accuracy.

In this thesis, we propose a visualization-based framework, and a tool that adheres to it, for tuning the indexing process for images and videos. We use a feature selection approach to improve the clustering of SIFT vectors. Users can visualize the quality of the clusters and interactively control the importance of individual feature dimensions or groups of dimensions. The results can be visualized quickly and the process repeated. The user can choose either a filter model or a wrapper model in our tool. To provide interactivity, we use input sampling, GPU-based processing, and visual tools for analyzing correlations. We present results of tuning the indexing for a few standard datasets. A few tuning iterations resulted in an improvement of over 5% in the final classification performance, which is significant.
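The idea of controlling the importance of feature dimensions before clustering can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes plain k-means on the CPU, a user-supplied weight per dimension, and a hypothetical function name `weighted_kmeans`. It only shows how changing the weights changes which dimensions drive the clusters.

```python
import numpy as np

def weighted_kmeans(X, k, weights, n_iter=20, seed=0):
    """Lloyd's k-means under a per-dimension weighted Euclidean distance.

    X       : (n, d) array of descriptors (d = 128 for SIFT)
    weights : (d,) non-negative importance of each dimension (assumed input)
    """
    rng = np.random.default_rng(seed)
    # Scaling each dimension by sqrt(w_d) makes the ordinary squared
    # Euclidean distance equal to sum_d w_d * (x_d - c_d)^2.
    Xw = X * np.sqrt(weights)
    centers = Xw[rng.choice(len(Xw), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((Xw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = Xw[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers

if __name__ == "__main__":
    # Synthetic stand-in for SIFT vectors: only dimension 0 separates
    # the two groups; the other 127 dimensions are noise.
    rng = np.random.default_rng(1)
    X = rng.normal(0.0, 5.0, size=(40, 128))
    X[:20, 0], X[20:, 0] = 0.0, 10.0
    w = np.zeros(128)
    w[0] = 1.0  # the user marks dimension 0 as the only important one
    labels, _ = weighted_kmeans(X, 2, w)
    print(labels)
```

In an interactive setting, the weight vector would be what the user adjusts between iterations; a weight of zero removes a dimension entirely, which corresponds to hard feature selection.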


Year of completion: 2012
Advisor: P. J. Narayanan

Related Publications

    • Dasari Pavan Kumar and P. J. Narayanan. Interactive Visualization and Tuning of SIFT Indexing. In Proceedings of the Vision, Modelling and Visualization Workshop 2010, Siegen, Germany, pp. 97-105. Eurographics Association, 2010.