Thesis Students

Repetition Detection and Shape Reconstruction in Relief Images.

Harshit Agrawal (homepage)

Relief carving is a very popular sculpting technique that has been used for decoration and for depicting stories and scenes from ancient times to the present day. Over time, many ancient cultural heritage artifacts are getting damaged, and one important method that aids their preservation and study is to capture them digitally.

Relief carvings have certain specific attributes that make them different from regular sculptures, and these attributes can be exploited in different computer vision tasks. Repetitive patterns are one such frequently occurring phenomenon in reliefs. Algorithms for detecting repeating patterns in images often assume that the repetition is regular and highly similar across instances. Approximate repetitions are also of interest in many domains, such as hand-carved sculptures, wall decorations and groups of natural objects. Detecting such repetitive structures can help in applications such as image retrieval, image inpainting and 3D reconstruction. In this work, we look at a specific class of approximate repetitions: those in images of hand-carved relief structures. We present a robust hierarchical method for detecting such repetitions. Given a single relief panel image, our algorithm finds dense matches of local features across the image at various scales. The matching features are then grouped based on their geometric configuration to find repeating elements. We also propose a method to group the repeating elements in order to segment the repetitive patterns in an image. In relief images, foreground and background have nearly the same texture, so a match of a single feature does not provide reliable evidence of repetition. Our grouping algorithm integrates evidence of repetition from many features to find repeating patterns reliably. The input image is processed on a scale-space pyramid to detect repetitions at different scales. Our method has been tested on images with a large variety of complex repetitive patterns, and the qualitative results show the robustness of our approach.

Point-based rendering suffers from the limited resolution of the fixed number of samples representing the model. At some distance, the screen-space resolution is high relative to the point samples, which causes under-sampling. A better way of rendering a model is to re-sample the surface during rendering at the desired resolution in object space, guaranteeing a sampling density sufficient for the image resolution. Output-sensitive sampling samples objects at a resolution that matches the expected resolution of the output image. This is crucial for hole-free point-based rendering. Many technical issues related to point-based graphics boil down to reconstruction and re-sampling. A point-based representation should be as small as possible while conveying the shape well.
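The repetition detection idea in this abstract (match local features within a single image, then group the matches by their geometric configuration) can be illustrated with a small sketch. This is not the thesis algorithm: it assumes that features are already extracted as `((x, y), descriptor)` pairs, that grouping is done by simple voting on displacement vectors, and it ignores the scale-space pyramid. The function name `dominant_offsets` and all thresholds are hypothetical.

```python
from collections import Counter
from math import dist


def dominant_offsets(features, max_desc_dist=0.5, bin_size=4, min_votes=3):
    """Vote for displacement vectors between similar features of one image.

    features: list of ((x, y), descriptor), descriptor being a tuple of floats.
    Returns [((dx, dy), votes), ...] for displacement bins with enough support.
    """
    votes = Counter()
    for i, (p, d) in enumerate(features):
        for q, e in features[i + 1:]:
            # Keep only pairs whose descriptors look alike (assumed threshold).
            if dist(d, e) > max_desc_dist:
                continue
            dx, dy = q[0] - p[0], q[1] - p[1]
            # Use one canonical direction so (dx, dy) and (-dx, -dy) vote together.
            if dx < 0 or (dx == 0 and dy < 0):
                dx, dy = -dx, -dy
            # Quantise the offset so slightly irregular repetitions share a bin.
            votes[(round(dx / bin_size), round(dy / bin_size))] += 1
    return [((bx * bin_size, by * bin_size), n)
            for (bx, by), n in votes.most_common() if n >= min_votes]


# Two kinds of features, each repeating every 20 pixels along x.
feats = [((0, 0), (1.0, 0.0)), ((20, 0), (1.0, 0.0)), ((40, 0), (1.0, 0.0)),
         ((5, 5), (0.0, 1.0)), ((25, 5), (0.0, 1.0)), ((45, 5), (0.0, 1.0))]
print(dominant_offsets(feats))
```

A single matched pair contributes only one vote, so an offset is reported only when several features agree on it, which mirrors the point made above that one feature match is not reliable evidence of repetition.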

Reconstructing geometric models of relief carvings is also of great importance in digitally preserving heritage artifacts. In the case of reliefs, using laser scanners and structured lighting techniques is not always feasible, or is very expensive, given the uncontrolled environment. Single-image shape from shading is an underconstrained problem that tries to solve for the surface normals given the intensity image. Various constraints are used to make the problem tractable. To avoid the uncontrolled lighting, we use a pair of images taken with and without the flash and compute an image under a known illumination. This image is used as the input to the shape reconstruction algorithms. We present techniques that reconstruct the shape from relief images using prior information learned from examples. We learn the variations in geometric shape corresponding to image appearances under different lighting conditions using sparse representations. Given a new image, we estimate the most appropriate shape that would result in the given appearance under the specified lighting conditions. We integrate the prior with the normals computed from the reflectance equation in a MAP framework. We test our approach on relief images and compare the results with state-of-the-art shape from shading algorithms.
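A minimal sketch of the flash and no-flash step follows. The abstract does not say how the image under known illumination is computed; the sketch assumes the common approach of subtracting the ambient (no-flash) image from the flash image, which relies on a static scene, identical exposure settings and a linear sensor response. It also shows the Lambertian reflectance model that shape from shading methods typically assume. Both function names are hypothetical.

```python
def flash_only_image(flash, no_flash):
    """Approximate the image lit by the flash alone.

    flash, no_flash: images as lists of rows of linear intensities.
    Light adds linearly, so flash minus ambient leaves the flash contribution;
    negative values (noise, saturation) are clamped to zero.
    """
    return [[max(f - a, 0.0) for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(flash, no_flash)]


def lambertian_intensity(normal, light, albedo=1.0):
    """Lambertian shading: I = albedo * max(0, n . l) for unit vectors n and l."""
    n_dot_l = sum(n * l for n, l in zip(normal, light))
    return albedo * max(n_dot_l, 0.0)


flash = [[0.75, 0.5]]
ambient = [[0.25, 0.75]]
print(flash_only_image(flash, ambient))
print(lambertian_intensity((0.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
```

Because the flash position relative to the camera is known, the resulting image has a known light direction, which is what makes the reflectance equation usable for estimating normals.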

Year of completion:	July 2014
Advisor :	Prof. C. V. Jawahar

Related Publications

Harshit Agrawal and Anoop M Namboodiri - Shape Reconstruction from a Single Relief Image Proceedings of the 2nd Asian Conference on Pattern Recognition (ACPR), 05-08 Nov. 2013, Okinawa, Japan. [PDF]

Harshit Agrawal, Anoop M. Namboodiri - Detection and Segmentation of Approximate Repetitive Patterns in Relief Images IEEE Eighth Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2012), December, 2012 [Project]

Downloads

Solving Decomposition Problems in Computer Vision using Linear Optimization

Ankit Gandhi (homepage)

An image can be considered as a union of many parts or a composition of multiple segments. Examples of such parts or segments include the different objects present in the image, the foreground and background regions, or the textual (regions containing text) and non-textual regions. Insight into these parts is essential for obtaining the inherent semantics and higher-level knowledge associated with an image. In this thesis, we introduce the notion of “decomposition” in images. Decomposition refers to breaking an image down into its constituent meaningful parts or segments. What counts as a meaningful part depends on the task of interest. In the case of foreground and background decomposition, it can be a pixel-accurate segmentation of the foreground or a tight rectangular box enclosing the foreground segment. In this work, we discuss the problem of decomposition in two kinds of images: natural images and document images. We also show how popular computer vision tasks such as object detection, semantic segmentation, document layout analysis and word spotting can be perceived as decomposition tasks.

In this thesis, we solve two decomposition problems in a linear optimization framework. The first is decomposing a global histogram of a natural image into the histograms of its associated objects and regions. The second is decomposing a questioned document image into regions containing copied content and regions containing non-copied (original) content, in a recognition-free setting.

The decomposition of a global histogram representation of an image into the histograms of its associated objects and regions is formulated as an optimization problem, given a set of linear classifiers that can effectively discriminate the object categories present in the image. This decomposition bypasses the harder problems associated with localization and explicit pixel-level segmentation. Our decomposition framework is also applicable to separating the histograms of object and background in an image. Our solution is computationally efficient, and we demonstrate its utility in multiple situations. We evaluate our method on a wide variety of composite histograms and also compare it with MRF-based solutions. We decompose histograms with an average accuracy of 86.4% on a Caltech-256 based dataset. In addition to measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for the task of image classification on the PASCAL VOC 2007 dataset.
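To make the idea of decomposing a histogram with linear classifiers concrete, here is a deliberately simplified toy version. It is an assumption, not the formulation used in the thesis: it maximises the total classifier score, the sum over classes k of w_k · h_k, subject only to the parts summing to the global histogram and being non-negative. That particular linear program separates per bin, so it can be solved without an LP solver; a realistic formulation would add further constraints and need a general solver. The name `decompose_histogram` is hypothetical.

```python
def decompose_histogram(h, classifiers):
    """Split a global histogram h into one part per linear classifier.

    Toy linear program (an illustrative assumption):
        maximise   sum_k  w_k . h_k
        subject to sum_k  h_k = h,   h_k >= 0
    Objective and constraints separate per bin, so the optimum gives all of a
    bin's mass to the classifier with the largest weight on that bin.
    """
    parts = [[0.0] * len(h) for _ in classifiers]
    for b, mass in enumerate(h):
        best = max(range(len(classifiers)), key=lambda k: classifiers[k][b])
        parts[best][b] = float(mass)
    return parts


global_hist = [4, 0, 3, 5]            # visual-word counts of the whole image
w_object = [1.0, 0.2, -0.5, 0.1]      # hypothetical classifier weights
w_background = [-0.3, 0.4, 0.9, 0.0]
obj_part, bg_part = decompose_histogram(global_hist, [w_object, w_background])
print(obj_part, bg_part)
```

The sketch only shows why no localization or pixel-level segmentation is needed: the parts are recovered directly in histogram space.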

To solve the problem of decomposition in questioned document images, we detect documents in the database that have exact or similar text to a given query document or region image. An exact duplicate is a direct cut-and-paste of content from multiple documents in the database, whereas near-duplicate document segments (similar text) can arise from various document manipulations such as summarization, copying, rewriting, editing, formatting and cut-and-paste. We refer to the corresponding problems as retrieval of exact and near-duplicate document images. We formulate the problem as a document retrieval task and solve it in a recognition-free setting. We propose two approaches that are capable of accurately detecting regions generated by these operations without depending on a reliable OCR. The first approach models the solution as finding a mixture of homographies and designs a linear programming (LP) based solution to compute it, while the second approach learns a discriminative classifier for a questioned document region to retrieve duplicate documents. Both approaches give encouraging results.

Year of completion:	July 2014
Advisor :	Prof. C. V. Jawahar

Related Publications

Ankit Gandhi, Karteek Alahari and C V Jawahar - Decomposing Bag of Words Histograms Proceedings of International Conference on Computer Vision (ICCV), 1-8 Dec. 2013, Sydney, Australia. [PDF]
Ankit Gandhi and C V Jawahar - Detection of Cut-And-Paste in Document Images Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR), 25-28 Aug. 2013, Washington DC, USA. [PDF]

Downloads

Learning Semantic Interaction Among Indoor Objects

Swagatika Panda (homepage)

Robot manipulation in clutter with objects in physical contact remains a challenging problem to date. The challenge is posed by the interaction among the objects at various levels of complexity. Understanding the positional semantics of the environment plays an important role in such tasks. The interaction with surrounding objects in the environment must be considered in order to perform the task without causing the objects to fall or get damaged. In our work, we learn the semantics in terms of support relationships among different objects in a cluttered environment by utilizing various photometric and geometric properties of the scene. To manipulate an object of interest, we use the inferred support relationships to derive a sequence in which its surrounding objects should be removed while causing minimal damage to the environment. We believe this work can push the boundary of robotic applications in grasping, object manipulation and picking-from-bin towards objects of generic shape and size and scenarios with physical contact and overlap.

In the first part of the thesis, we aim at learning semantic interaction among objects of generic shapes and sizes lying in clutter involving both direct and indirect physical contact. Three types of support relationships are inferred: "Support from below", "Support from side", and "Containment". Subsequently, the learned semantic interaction or support relationship is used to derive a sequence or order in which the objects surrounding the object of interest should be removed without causing damage to the environment. The generated sequence is called Support Order. We have proposed and analysed two alternative approaches for support inference. In the first approach "Multiple Object Support Inference", support relations between all possible pairs are inferred. In the second approach "Hierarchical Support Inference", given an object of interest, its support relationship with other graspable objects is inferred hierarchically. The support relationship is used to predict the "support order" or the order in which the surrounding objects need to be removed in order to manipulate the target object.
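Once support relations have been inferred, deriving a support order can be sketched as a graph traversal. The sketch below is an assumption about one reasonable way to do it, not the method from the thesis: it treats all three relation types alike, encodes them as a hypothetical `rests_on` mapping, and removes objects in depth-first post-order so that anything resting on an object is removed before that object.

```python
def support_order(target, rests_on):
    """Order in which to remove objects before the target can be manipulated.

    rests_on[a] is the set of objects that a is directly supported by
    (from below, from the side, or by containment).
    """
    # Invert the relation: which objects does each object support?
    supported = {}
    for obj, bases in rests_on.items():
        for base in bases:
            supported.setdefault(base, set()).add(obj)

    order, seen = [], {target}

    def visit(obj):
        for dependant in sorted(supported.get(obj, ())):
            if dependant not in seen:
                seen.add(dependant)
                visit(dependant)          # first clear whatever rests on it
                order.append(dependant)   # then the dependant itself

    visit(target)
    return order


scene = {"book": {"table"}, "cup": {"book"}, "spoon": {"cup"}, "pen": {"table"}}
print(support_order("book", scene))  # → ['spoon', 'cup']
```

Objects that the target does not support, directly or indirectly, such as the pen in this example, are left untouched.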

In the second part of the thesis, we attempt to learn the semantic interaction among different objects in clutter using multiple views. First, the support relationships among objects in each view are inferred. Then the inferred support relationships are combined to define support relationships across multiple views. The combined global support relationship is used to recover missing support relations and to predict the support order, that is, the order in which the objects surrounding an object of interest should be removed. The support order predicted using the global support relationship incorporates hidden objects and missing spatial support relations.
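A minimal sketch of combining per-view support relations is given below. It assumes that object identities have already been matched across views and that combining means taking the union of the relations found in each view; the thesis may use a more elaborate scheme. The name `merge_views` and the `rests_on`-style dictionaries are hypothetical.

```python
def merge_views(per_view_relations):
    """Union the support relations inferred independently in each view.

    per_view_relations: list of dicts mapping an object to the set of objects
    that support it in that view. Objects are assumed to carry the same
    identifier in every view.
    """
    merged = {}
    for relations in per_view_relations:
        for obj, bases in relations.items():
            merged.setdefault(obj, set()).update(bases)
    return merged


front_view = {"cup": {"book"}}                    # spoon is hidden from the front
side_view = {"cup": {"book"}, "spoon": {"cup"}}
print(merge_views([front_view, side_view]))
```

A relation that is occluded in one view but visible in another appears in the merged result, which is how hidden objects can enter the predicted support order.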

We have created two RGBD datasets consisting of various everyday objects in clutter. In the "Indoor dataset for clutter", 50 cluttered scenes are captured from a frontal view using 35 objects of different shapes and sizes. In the "Indoor multiview dataset", 7 cluttered scenes are captured, each from multiple views; in total, this dataset contains 67 images captured using 9 objects of different shapes and sizes. The dataset is made publicly available to the research community. We explore many different settings involving different kinds of object-object interaction, and we successfully learn support relationships and predict support order in these settings. This work can play a significant role in extending the scope of manipulation to cluttered environments involving both direct and indirect physical contact, and generic objects.

Keywords: Robotic Vision, Support Relations, Support Order, RGBD, Semantic Interaction, Clutter, Multiple Views

Year of completion:	July 2014
Advisor :	Prof. C. V. Jawahar

Related Publications

Swagatika Panda, A.H. Abdul Hafez and C. V. Jawahar - Learning Semantic Interaction among Graspable Objects Proceedings of 5th International Conference on Pattern Recognition and Machine Intelligence (PReMI), 10-14 Dec. 2013, Kolkata, India. [PDF]
Swagatika Panda, A.H. Abdul Hafez and C V Jawahar - Learning Support Order for Manipulation in Clutter Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), 03-08 Nov. 2013, Tokyo, Japan. [PDF]

Downloads

Fingerprint Image Enhancement Using Unsupervised Hierarchical Feature Learning

Mihir Sahasrabudhe

The use of fingerprints is an important method for identification of individuals in today's world. They are also one of the most reliable biometric traits, besides the iris. Fingerprint recognition refers to various tasks that are associated with fingerprint identification, verification, feature extraction, indexing and classification. There are a lot of systems, in a variety of domains, that employ fingerprint recognition. That being a given, precision in fingerprint recognition is essential.

Identification using fingerprints is done using a feature extraction step, followed by matching of these features. The features that are extracted depend on the algorithm being used for identification. In large databases of fingerprints, such as government records, fingerprints might be indexed before they are matched. This significantly reduces the time required to identify an individual from records, as comparing his/her prints with every entry in the database would take an enormous amount of time. In either case, feature extraction plays an important role. However, feature extraction is affected directly by the quality of the input image. A noisy or unclear fingerprint image can strongly affect the extraction of features. To counter noise in input images, an extra enhancement step is introduced before feature extraction and matching are performed. The goal of enhancement is to improve the quality of ridges and valleys in the fingerprint by making them clearly distinguishable while preserving information. The enhancement algorithm should neither remove existing information from the fingerprint nor introduce spurious features that were not present in the original image.

The considerable research into fingerprint recognition, and into fingerprint enhancement, has contributed to the large number of existing algorithms for image enhancement. These include pixel-wise enhancement, contextual filtering and Fourier analysis, to name a few. Contextual filtering convolves the input image with filters whose parameters are determined by the values of certain features at every pixel of the input image. This requires the extraction of pre-defined features from the fingerprint. For instance, the filter used at a point is affected by the ridge orientation at that point. Hence, to decide which filter to use, the ridge orientation at that point needs to be extracted. A similar dependence can be observed in other types of algorithms too.
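The dependence of contextual filtering on ridge orientation can be illustrated with the widely used gradient-based orientation estimate. This is a generic sketch, not the method proposed in the thesis: it assumes a small grey-level block given as a list of rows, uses central differences for the gradients, and returns the ridge direction as an angle in image coordinates. The function name `ridge_orientation` is hypothetical.

```python
import math


def ridge_orientation(block):
    """Estimate the dominant ridge orientation of a block, in radians in [0, pi).

    Gradients point across the ridges, so the ridge direction is the dominant
    gradient direction rotated by 90 degrees.
    """
    h, w = len(block), len(block[0])
    gxx = gyy = gxy = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (block[y][x + 1] - block[y][x - 1]) / 2.0  # central difference in x
            gy = (block[y + 1][x] - block[y - 1][x]) / 2.0  # central difference in y
            gxx += gx * gx
            gyy += gy * gy
            gxy += gx * gy
    # Doubled-angle averaging avoids cancellation of opposite gradients.
    gradient_angle = 0.5 * math.atan2(2.0 * gxy, gxx - gyy)
    return (gradient_angle + math.pi / 2.0) % math.pi


# Synthetic block whose intensity varies only along x, i.e. vertical ridges.
vertical = [[math.sin(x) for x in range(8)] for _ in range(8)]
print(math.degrees(ridge_orientation(vertical)))
```

A contextual filter such as an oriented Gabor kernel would then be tuned to this angle at each block, which is why noisy orientation estimates degrade the enhancement.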

In this thesis, we propose that unsupervised feature learning be applied to the fingerprint enhancement problem. We use two different scenarios and models to show that unsupervised feature learning helps improve an existing algorithm and, when applied directly to greyscale images, can compete with robust contextual filtering and Fourier analysis algorithms. Our motivation lies in the fact that there is a vast amount of available data on fingerprints; with the recent advances in deep learning, and in unsupervised feature learning in particular, we can use this data to learn structures in fingerprint images.

For the first model, we show that continuous restricted Boltzmann machines can be used to learn local fingerprint orientation field patterns, after which their learning can be used to correct noisy, local ridge orientations. This extra step is introduced between orientation field estimation and contextual filtering. We show that this step improves the performance of matching done on the enhanced images. In the second model, we use a 3-layered convolutional deep belief network to learn features directly from greyscale fingerprint images. We show that having a deep neural network (3 layers) significantly improves the quantitative and qualitative performance of the enhancement. The deep network helps in predicting noisy regions, that were otherwise not reconstructed by the first layer only.

In conclusion, we have explored a new direction for attacking the fingerprint enhancement problem. We conjecture that it is possible to extend this work to other problems involving fingerprint recognition too. For instance, synthetic fingerprint generation might be accomplished using the convolutional deep belief network trained on fingerprint features. Our experiments suggest several potential future experiments that may give promising results.

Year of completion:	December 2014
Advisor :	Anoop M. Namboodiri

Related Publications

Mihir Sahasrabudhe, Anoop M. Namboodiri - Fingerprint Enhancement using Unsupervised Hierarchical Feature Learning Proceedings of the Ninth Indian Conference on Computer Vision, Graphics and Image Processing, 14-17 Dec 2014, Bangalore, India. [PDF]
Mihir Sahasrabudhe and Anoop M Namboodiri - Learning Fingerprint Orientation Fields Using Continuous Restricted Boltzmann Machines Proceedings of the 2nd Asian Conference on Pattern Recognition (ACPR), 05-08 Nov. 2013, Okinawa, Japan. [PDF]

Downloads

Secure Biometric Authentication with Fixed-Length Binary Representations.

Rohan Kulkarni (homepage)

Biometrics have been established to be extremely reliable at the task of identifying individuals and are thus at the core of several real-world systems, ranging from employee attendance to access control systems in the military. With growing computing resources available to individuals, biometric authentication systems are being deployed in an even wider range of commercial applications. The permanent nature of biometrics raises serious security concerns with these deployments. Also, losing one's biometric trait can compromise that individual's identity in all the systems he is enrolled in. Biometrics are non-rigid in nature and require a fuzzy matching process, which makes it difficult to directly borrow popular security techniques used elsewhere with passwords and key-cards. Research in this field therefore attempts to develop efficient and reliable biometric authentication systems while addressing the issues of security and privacy.

Binary biometric representations have been shown to provide significant improvement in efficiency without compromising system performance for various modalities, including fingerprints, palmprints and iris. Hence, this thesis focuses on developing secure and privacy-preserving protocols for fixed-length binary biometric templates that use Hamming distance as the dissimilarity measure. We propose a novel authentication protocol using a somewhat homomorphic encryption scheme that provides template protection and the ability to use masks while computing the Hamming distance. The protocol operates on encrypted data, providing complete biometric privacy to individuals trying to authenticate and revealing only the final matching score to the server. It allows real-time authentication and retains the matching accuracy of the underlying representation, as demonstrated by our experiments on iris and palmprints.
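For reference, the quantity that such a protocol evaluates can be written in plaintext as a masked Hamming distance. The sketch below is only the unencrypted computation, under the common convention that a mask bit of 1 marks a valid template bit; it does not show the homomorphic encryption or the protocol itself. Templates are represented as Python integers, and the function name `masked_hamming` is hypothetical.

```python
def masked_hamming(code_a, mask_a, code_b, mask_b):
    """Fractional Hamming distance over the bits that are valid in both templates.

    code_*: fixed-length binary templates stored as integers.
    mask_*: bit is 1 where the corresponding template bit is reliable.
    """
    valid = mask_a & mask_b
    n_valid = bin(valid).count("1")
    if n_valid == 0:
        raise ValueError("templates share no valid bits")
    disagreements = bin((code_a ^ code_b) & valid).count("1")
    return disagreements / n_valid


enrolled, enrolled_mask = 0b10110010, 0b11111100
probe, probe_mask = 0b10011010, 0b01111111
print(masked_hamming(enrolled, enrolled_mask, probe, probe_mask))  # → 0.4
```

The computation needs only XOR, AND and bit counting, which are simple operations and may be part of why fixed-length binary templates suit encrypted-domain matching.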

We also propose a one-time biometric token based authentication protocol for widely used banking transactions. In the current scenario, the user is forced to trust the service provider with his sole banking credentials or credit card details in order to use the desired services. Commonly used one-time password based systems do provide additional transaction security; however, organizations using such systems are still unable to differentiate between a genuine user trying to authenticate and an adversary with stolen credentials. Involving biometric security would strengthen the authentication process. The proposed protocol upholds the requirements of secure authentication, template protection and revocability while providing user anonymity from the service provider. We demonstrate our system's security and performance using iris biometrics to authenticate individuals.

Year of completion:	December 2014
Advisor :	Anoop M. Namboodiri

Related Publications

Rohan Kulkarni, Anoop M. Namboodiri - One-Time Biometric Token based Authentication Proceedings of the Ninth Indian Conference on Computer Vision, Graphics and Image Processing, 14-17 Dec 2014, Bangalore, India. [PDF]
Rohan Kulkarni and Anoop M Namboodiri - Secure Hamming Distance based Biometric Authentication Proceedings of the 6th IAPR International Conference on Biometrics, 04-07 June 2013, Madrid, Spain. [PDF]
