
Secure Biometric Authentication with Fixed-Length Binary Representations


Rohan Kulkarni (homepage)

Biometrics have been established to be extremely reliable for identifying individuals and are thus at the core of several real-world systems, ranging from employee attendance to access-control systems in the military. With growing computing resources available to individuals, biometric authentication systems are being deployed in an even wider range of commercial applications. The permanent nature of biometrics raises serious security concerns with these deployments: losing one's biometric trait can compromise an individual's identity in every system they are enrolled in. Biometric traits are also non-rigid in nature, requiring a fuzzy matching process, which makes it difficult to directly borrow popular security techniques used elsewhere with passwords and key-cards. Research in this field therefore attempts to develop efficient and reliable biometric authentication systems while addressing these issues of security and privacy.

Binary biometric representations have been shown to provide significant improvements in efficiency without compromising system performance for various modalities, including fingerprints, palmprints and iris. Hence, this thesis focuses on developing secure and privacy-preserving protocols for fixed-length binary biometric templates that use Hamming distance as the dissimilarity measure. We propose a novel authentication protocol using a somewhat homomorphic encryption scheme that provides template protection and the ability to use masks while computing the Hamming distance. The protocol operates on encrypted data, providing complete biometric privacy to individuals trying to authenticate and revealing only the final matching score to the server. It allows real-time authentication and retains the matching accuracy of the underlying representation, as demonstrated by our experiments on iris and palmprints.
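The masked Hamming distance at the heart of such protocols is easy to state in the clear. Below is a minimal NumPy sketch on plaintext toy templates; the thesis performs this computation under homomorphic encryption, which this illustration omits:

```python
import numpy as np

def masked_hamming(a, b, mask_a, mask_b):
    """Fractional Hamming distance over bits valid in both templates
    (e.g. iris bits not occluded by eyelids or eyelashes)."""
    valid = mask_a & mask_b
    n_valid = int(valid.sum())
    if n_valid == 0:
        return 1.0           # nothing comparable: treat as maximal distance
    return np.count_nonzero((a ^ b) & valid) / n_valid

# toy 8-bit templates and a shared validity mask
a = np.array([1, 0, 1, 1, 0, 0, 1, 0])
b = np.array([1, 1, 1, 0, 0, 0, 1, 1])
m = np.array([1, 1, 1, 1, 1, 1, 0, 0])   # last two bits occluded
d = masked_hamming(a, b, m, m)            # differs at 2 of the 6 valid bits
```

Normalising by the number of mutually valid bits, rather than the template length, is what lets the measure ignore occluded regions.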

We also propose a one-time biometric token based authentication protocol for widely used banking transactions. In the current scenario, the user is forced to trust the service provider with his sole banking credentials or credit card details to avail the desired services. Commonly used one-time-password systems do provide additional transaction security; however, organizations using such systems are still incapable of differentiating between a genuine user trying to authenticate and an adversary with stolen credentials. Involving biometric security would certainly strengthen the authentication process. The proposed protocol upholds the requirements of secure authentication, template protection and revocability, while providing user anonymity from the service provider. We demonstrate our system's security and performance using iris biometrics to authenticate individuals.

 

Year of completion: December 2014
Advisor : Anoop M. Namboodiri

Related Publications

  • Rohan Kulkarni, Anoop M. Namboodiri - One-Time Biometric Token based Authentication, Proceedings of the Ninth Indian Conference on Computer Vision, Graphics and Image Processing, 14-17 Dec 2014, Bangalore, India. [PDF]

  • Rohan Kulkarni and Anoop M. Namboodiri - Secure Hamming Distance based Biometric Authentication, Proceedings of the 6th IAPR International Conference on Biometrics, 04-07 June 2013, Madrid, Spain. [PDF]


Downloads

thesis

 ppt

Word Recognition of Indic Scripts


Naveen TS (homepage)

For most Indian scripts, Optical Character Recognition (OCR) problems are often formulated as an isolated character (symbol) classification task followed by a post-classification stage (containing modules like UNICODE generation, error correction, etc.) to generate the textual representation. Such approaches are prone to failures due to (i) the difficulty of designing a reliable word-to-symbol segmentation module that works robustly in the presence of degraded (cut/fused) images, and (ii) converting the outputs of the classifiers into a valid sequence of UNICODEs. In this work, we look at two important aspects of word recognition: word-image to text-string conversion, and error detection and correction in words represented as UNICODEs. In this thesis, we propose a formulation in which the expectations on the two critical modules of a traditional OCR (i.e., segmentation and isolated character recognition) are minimized, and the harder recognition task is modelled as learning an appropriate sequence-to-sequence transcription scheme. We thus formulate recognition as a direct transcription problem. Given many examples of feature sequences and their corresponding UNICODE representations, our objective is to learn a mapping that can convert a word directly into a UNICODE sequence. This formulation has multiple practical advantages: (i) it significantly reduces the number of classes for Indian scripts, (ii) it removes the need for word-to-symbol segmentation, (iii) it does not require strong annotation of symbols to design the classifiers, and (iv) it directly generates a valid sequence of UNICODEs. We test our method on more than 5000 pages of printed documents in multiple languages. We design a script-independent, segmentation-free architecture which works well for 7 Indian scripts. Our method is compared against other state-of-the-art OCR systems and evaluated on a large corpus.
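As one illustration of transcription-style decoding, the sketch below collapses per-frame label posteriors into a UNICODE string in the greedy CTC manner: take the best label per frame, merge repeats, drop blanks. This is a generic approximation of how sequence learners of this kind emit text, not the thesis's exact architecture; the toy posteriors and label set are invented for the example:

```python
import numpy as np

BLANK = 0  # CTC blank label

def greedy_ctc_decode(frame_probs, id_to_char):
    """Collapse per-frame predictions into a label sequence:
    argmax at each frame, merge consecutive repeats, drop blanks."""
    best = frame_probs.argmax(axis=1)
    out, prev = [], BLANK
    for label in best:
        if label != prev and label != BLANK:
            out.append(id_to_char[label])
        prev = label
    return "".join(out)

# toy posteriors over {blank, क, म, ल} for 7 frames
probs = np.array([
    [0.1, 0.8, 0.05, 0.05],   # क
    [0.1, 0.8, 0.05, 0.05],   # क (repeat, merged)
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.05, 0.8, 0.05],   # म
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.05, 0.05, 0.8],   # ल
    [0.9, 0.05, 0.03, 0.02],  # blank
])
word = greedy_ctc_decode(probs, {1: "क", 2: "म", 3: "ल"})  # collapses to "कमल"
```

The key point mirrored from the text is that no symbol segmentation is needed: the decoder maps a whole feature sequence directly to a valid UNICODE string.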

The second contribution of this thesis is an investigation into the possibility of error detection and correction in highly inflectional languages, taking Malayalam and Telugu as examples. Error detection in OCR output using dictionaries and statistical language models (SLMs) has been common practice for some time in the design of post-processors, and multiple strategies have been used successfully for English. However, this has not yet translated into improved error detection performance for many inflectional languages, especially Indian languages. Challenges such as very large unique-word lists and the lack of linguistic resources and reliable language models are some of the reasons for this. In this thesis, we investigate the major challenges in developing error detection techniques for highly inflectional Indian languages. We compare and contrast several attributes of English with inflectional languages such as Telugu and Malayalam, make observations by analysing statistics computed from popular corpora, and relate these observations to the error detection schemes. We propose a method that learns from error patterns and SLMs, and detects errors in Telugu and Malayalam with an F-score comparable to that of less inflectional languages like Hindi.
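A common way to realize SLM-style error detection is to flag words containing character n-grams never seen in a corpus, which sidesteps the huge unique-word lists of inflectional languages. The sketch below shows the idea on a toy English corpus; it is illustrative only, and the thesis's actual method additionally learns from error patterns:

```python
from collections import Counter

def char_ngrams(word, n=3):
    padded = f"^{word}$"                   # mark word boundaries
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def build_model(corpus_words, n=3):
    """Count character n-grams over a (toy) corpus."""
    counts = Counter()
    for w in corpus_words:
        counts.update(char_ngrams(w, n))
    return counts

def is_suspicious(word, model, n=3):
    """Flag a word as a likely OCR error if it contains any
    character n-gram never seen in the corpus."""
    return any(model[g] == 0 for g in char_ngrams(word, n))

corpus = ["recognition", "recognise", "cognitive", "reconnect"]
model = build_model(corpus)
ok = is_suspicious("recognition", model)    # all trigrams attested
bad = is_suspicious("recogn1tion", model)   # trigrams like 'n1t' are unseen
```

Because the model is built over sub-word units rather than whole words, it generalizes to inflected forms that never appear verbatim in the corpus.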

 

Year of completion: January 2014
Advisor : Prof. C. V. Jawahar

 


Related Publications

  • Praveen Krishnan, Naveen Sankaran, Ajeet Kumar Singh and C. V. Jawahar - Towards a Robust OCR System for Indic Scripts, Proceedings of the 11th IAPR International Workshop on Document Analysis Systems, 7-10 April 2014, Tours-Loire Valley, France. [PDF]

  • Naveen Sankaran, Aman Neelappa and C V Jawahar - Devanagari Text Recognition: A Transcription Based Formulation, Proceedings of the 12th International Conference on Document Analysis and Recognition, 25-28 Aug. 2013, Washington DC, USA. [PDF]

  • Naveen Sankaran and C V Jawahar - Error Detection in Highly Inflectional Languages, Proceedings of the 12th International Conference on Document Analysis and Recognition, 25-28 Aug. 2013, Washington DC, USA. [PDF]

  • Naveen Sankaran, C V Jawahar - Recognition of Printed Devanagari Text Using BLSTM Neural Network, Proceedings of the 21st International Conference on Pattern Recognition, 11-15 Nov. 2012, pp. 322-325, Vol. 21, ISBN 978-4-9906441-1-6, Japan. [PDF]

 


Downloads

 

thesis

 ppt

Modeling Scene Text and Texture by Decomposing into Component Images


Siddharth Kherada (homepage)

Separating images into constituent components, based on the source of the data, its frequency distribution, or its nature, is a widely used technique in image processing and computer vision. Many problems are solved by partitioning images into components and working on each component separately. Common examples include breaking an image down into Red, Green and Blue channels for ease of representation, or into Luminance and Chrominance for better compression. In this thesis, we explore the separation of natural images into appropriate components for the purposes of representation as well as recognition. We first introduce a framework where separating images into direct and global components helps in modeling 3D textures. These 3D textures are often described by a parametric function at each pixel that models the variation in its appearance with respect to the lighting direction. However, parametric models such as Polynomial Texture Maps (PTMs) tend to smooth out changes in appearance. We therefore propose a technique to effectively model natural material surfaces and their interactions with changing light conditions. We show that the direct and global components of an image have different characteristics and, when modeled separately, lead to a more accurate and compact model of the 3D surface texture. The direct component is mainly affected by the structural properties of the surface and therefore captures phenomena like shadows and specularity, which are sharply varying functions. The global component models the overall luminance and color values, a smoothly varying function. For a given lighting position, both components are computed separately and combined to render a new image. This method models sharp shadows and specularities while preserving the structural relief and surface color, so the rendered images have enhanced photorealism compared to images rendered by existing single-pixel models such as PTMs.
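For the smoothly varying global component, a standard PTM fit is a per-pixel least-squares problem over a six-term biquadratic basis in the projected light direction (lu, lv). A sketch for a single pixel with synthetic observations might look as follows; the direct component (shadows, specularity) would be modeled separately, as the paragraph above describes:

```python
import numpy as np

def fit_ptm(lights, intensities):
    """Per-pixel least-squares fit of the six-term biquadratic PTM basis:
    I(lu, lv) ~ a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5."""
    lu, lv = lights[:, 0], lights[:, 1]
    A = np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones_like(lu)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return coeffs

def eval_ptm(coeffs, lu, lv):
    """Evaluate the fitted pixel model for a new light direction."""
    return np.array([lu**2, lv**2, lu * lv, lu, lv, 1.0]) @ coeffs

# toy pixel whose smooth (global) response is exactly biquadratic
rng = np.random.default_rng(0)
lights = rng.uniform(-1, 1, size=(20, 2))        # sampled light directions
true = np.array([0.2, -0.1, 0.05, 0.3, 0.4, 0.5])
lu, lv = lights[:, 0], lights[:, 1]
obs = np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones(20)], axis=1) @ true
coeffs = fit_ptm(lights, obs)
global_part = eval_ptm(coeffs, 0.1, 0.2)         # global component, new light
```

Rendering then adds the separately modeled direct term for the same light position, which is where the sharp shadows and specularities that a pure PTM smooths away are reintroduced.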

We then look at separating an image based on its sources of illumination or albedo variations for the purpose of scene text segmentation. Extracting text from scene images is a challenging task due to variations in the color, size and font of the text, and the results are often affected by complex backgrounds, different lighting conditions, shadows and reflections. A robust solution to this problem can significantly enhance the accuracy of scene text recognition algorithms, leading to a variety of applications such as scene understanding, automatic localization and navigation, and image retrieval. We propose a method to extract and binarize text from images that contain complex backgrounds. We use Independent Component Analysis (ICA) to map out the text region, which is inherently uniform in nature, while relegating shadows, specularity and reflections to the background. The technique identifies the text regions from the components extracted by ICA using a simple global thresholding method to isolate the foreground text. We show the results of our algorithm on some of the most complex word images from the ICDAR 2003 Robust Word Recognition dataset and compare with previously reported methods.
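The final thresholding stage can be any global method; a common choice is Otsu's threshold, sketched below on a synthetic bimodal signal. The ICA decomposition itself is omitted here, and the toy data merely stands in for a text-bearing component:

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Otsu's global threshold: choose the cut that maximizes the
    between-class variance of the two resulting groups."""
    hist, edges = np.histogram(values, bins=nbins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                      # class-0 (below cut) weight
    w1 = 1 - w0
    cum_mean = np.cumsum(p * centers)
    mu_total = cum_mean[-1]
    mu0 = cum_mean / np.where(w0 > 0, w0, 1)
    mu1 = (mu_total - cum_mean) / np.where(w1 > 0, w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between)]

# bimodal toy "component": background around 0.2, text around 0.8
rng = np.random.default_rng(1)
comp = np.concatenate([rng.normal(0.2, 0.03, 500), rng.normal(0.8, 0.03, 100)])
t = otsu_threshold(comp)
binary = comp > t        # foreground text mask
```

On a component where ICA has already concentrated the text into one uniform mode, this single global cut is enough to isolate the foreground.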

 

Year of completion: December 2014
Advisor : Anoop M. Namboodiri

 


Related Publications

  • Siddharth Kherada and Anoop M Namboodiri - An ICA based Approach for Complex Color Scene Text Binarization, Proceedings of the 2nd Asian Conference on Pattern Recognition, 05-08 Nov. 2013, Okinawa, Japan. [PDF]

  • Siddharth Kherada, Prateek Pandey, Anoop M. Namboodiri - Improving Realism of 3D Texture using Component Based Modeling, Proceedings of the IEEE Workshop on Applications of Computer Vision, 9-11 Jan. 2012, ISSN 1550-5790, E-ISBN 978-1-4673-0232-6, Print ISBN 978-1-4673-0233-3, pp. 41-47, Breckenridge, CO, USA. [PDF]


Downloads

thesis

ppt

 

Skyline Segmentation Using Shape-constrained MRFs


Rashmi Tone Vilas (homepage)

MRF energy minimization has been used for image segmentation in a wide range of applications, but standard MRF energy minimization techniques are computationally expensive. Moreover, incorporating higher-order priors, such as shape and its associated parameters, is either very complex or computationally expensive, or requires prior information such as the shape's location. Furthermore, a pure MRF formulation does not provide semantic understanding: information about the structure of a skyline, such as depth, cannot be recovered from its output. Standard semantic segmentation methods using geometric context information are restricted to very few geometric classes, while those that exploit a specific “tiered” structure are computationally exponential in the number of labels.

Our aim is to extract the detailed structure of a skyline, i.e. the individual buildings and their depth ordering, with no restriction on the number of labels. The problem is challenging for numerous reasons, such as complex occlusion patterns, the large number of labels, and intra-region color and texture variations. We propose an approach for segmenting the individual buildings in typical skylines, based on a Markov Random Field (MRF) formulation that exploits the fact that such images contain overlapping objects of similar shapes exhibiting a “tiered” structure. Our contributions are the following:

  • We introduce Skyline-12, a dataset of 120 skyline images from 12 cities around the world. All images are manually annotated, with additional meta-data such as initial boundaries and seeds.
  • We analyse and integrate low-level features such as color, texture and shape that are useful for the segmentation of skylines.
  • We propose a fast, accurate and robust method to extract the individual buildings of a skyline, exploiting the “tiered” structure of skylines and incorporating a rectangular shape prior in the MRF formulation.

For simple shapes such as rectangles, our formulation is significantly faster to optimize than a standard MRF approach, while also being more accurate. We experimentally evaluate various MRF formulations and demonstrate the effectiveness of our approach in segmenting skyline images.
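The benefit of the tiered assumption can be seen in a stripped-down one-dimensional case: if each image column carries exactly one boundary row, the optimal boundary under a unary-plus-smoothness energy can be found exactly by dynamic programming over columns. The sketch below illustrates only this idea, with an invented toy cost table, and is not the thesis's full multi-label MRF formulation:

```python
import numpy as np

def tiered_boundary(unary, smooth_weight=1.0):
    """Viterbi-style DP: for each column choose one boundary row,
    trading per-column evidence (unary cost) against a smoothness
    penalty |row_i - row_{i-1}| between neighbouring columns."""
    n_cols, n_rows = unary.shape
    rows = np.arange(n_rows)
    cost = unary[0].copy()
    back = np.zeros((n_cols, n_rows), dtype=int)
    for c in range(1, n_cols):
        pair = smooth_weight * np.abs(rows[:, None] - rows[None, :])  # [prev, cur]
        total = cost[:, None] + pair
        back[c] = total.argmin(axis=0)
        cost = total.min(axis=0) + unary[c]
    path = [int(cost.argmin())]           # backtrack the best boundary
    for c in range(n_cols - 1, 0, -1):
        path.append(int(back[c, path[-1]]))
    return path[::-1]

# toy 5-column, 4-row costs: strong evidence for row 1, one noisy column
unary = np.full((5, 4), 5.0)
unary[:, 1] = 0.0
unary[2] = [0.0, 5.0, 5.0, 5.0]           # column 2 noisily prefers row 0
boundary = tiered_boundary(unary, smooth_weight=3.0)
```

With a sufficiently strong smoothness weight, the DP overrides the single noisy column and keeps the boundary at row 1 throughout, which is the kind of regularization the tiered prior provides far more cheaply than generic multi-label MRF inference.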

We propose both interactive and automatic methods for segmenting skylines. While the interactive setting gives accurate output and a fast approach to segmenting skylines given input seeds from the user, the automatic setting provides about 25% improvement over state-of-the-art low-level automatic segmentation methods. Our approach can be generalized to other shapes, and the detailed structure of a skyline can be used in many applications, such as 3D reconstruction of a skyline from a single image.

 

Year of completion: January 2015
Advisor : Prof. C. V. Jawahar



Downloads

thesis

ppt

Enhancing Bag of Words Image Representations


Vinay Garg (homepage)

The bag of visual words model, inspired by text classification, has been used extensively for solving various computer vision tasks such as image classification, image retrieval and image recognition. In text classification, the vocabulary is a fixed, finite set of words from a particular language; this is not the case in the visual domain, where there is no inherent concept of words. The complexity of the visual domain is so high that even a small change such as rotation, translation, a change of viewing angle, or lighting has a huge impact on the information a machine perceives. Even though such changes make little difference to humans, to a machine the results are all different images. To overcome this problem, the vision community has defined the concept of visual words, analogous to textual words. However, visual words are not very well defined, owing to the vastness of the visual domain compared to textual data with its finite number of words. Using these visual words, we create an image representation as the frequency of each visual word in the image, and in turn use these representations for various vision tasks.

In this thesis we aim to improve these image representations, since the accuracy and performance of various vision models depend directly on the quality of the image representations given to them as input. We start with the traditional bag of visual words, study various practical issues and drawbacks of that approach, and refine one step of the pipeline at a time. In doing so, we devise novel strategies to overcome some of the issues observed in the traditional approaches. The approaches we apply involve various parameters that need fine tuning; we discuss the effect of each parameter in detail, with empirical results to support our hypotheses, and conclude that our representations are better than the various traditional approaches presently in use.

To address the information loss caused by hard assignments in the traditional bag of words, we analyze various soft assignment techniques. On replacing hard assignments with soft assignments, we find that classification results improve drastically. Comparing different soft assignment techniques among themselves, we find that absolute soft assignments are better than relative soft assignments. We demonstrate the superiority of our approaches on various popular datasets.
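A soft-assignment histogram of the kind discussed can be sketched as follows: each descriptor spreads Gaussian-kernel weight over the whole codebook instead of casting one vote for its nearest visual word. The codebook, descriptors and kernel width below are toy values for illustration, not the thesis's particular soft-assignment scheme:

```python
import numpy as np

def soft_bow(descriptors, codebook, sigma=1.0):
    """Soft-assignment bag of words: each descriptor distributes
    Gaussian-kernel weights over all visual words instead of
    voting only for its nearest word."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    w /= w.sum(axis=1, keepdims=True)   # each descriptor contributes weight 1
    hist = w.sum(axis=0)
    return hist / hist.sum()            # normalised image representation

# toy 2-D codebook of three visual words, and three descriptors
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descs = np.array([[0.1, 0.0], [0.9, 0.1], [0.5, 0.5]])  # last one ambiguous
hist = soft_bow(descs, codebook, sigma=0.3)
```

The ambiguous third descriptor, which a hard assignment would force onto a single word, here splits its weight across several words, which is exactly the information that hard quantization discards.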

Recently, the vision community has shown that Fisher vector image representations outperform bag of words representations. This boost in performance arises because Fisher vectors use soft assignments and reduce information loss by capturing the deviation of each visual feature from the mean. However, they too have their share of drawbacks: Fisher vector representations are very large, and they are not inherently discriminative. We therefore introduce sparseness to reduce the effective size of the representations, and add class information to make them discriminative. These additions reduce the high storage requirements, while the added class information also improves performance. We validate these findings on various datasets.
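To make the size versus sparsity trade-off concrete, the sketch below computes a first-order Fisher vector with respect to GMM means and then sparsifies it by keeping only the largest-magnitude entries. This is a generic illustration under simplifying assumptions (scalar sigma, means-only gradients, toy data), not the thesis's exact sparse discriminative formulation:

```python
import numpy as np

def fisher_vector_means(descs, means, weights, sigma):
    """First-order Fisher vector w.r.t. the means of a diagonal GMM:
    soft-assign each descriptor, then accumulate its normalised
    deviation from every Gaussian's mean."""
    d2 = (((descs[:, None, :] - means[None]) / sigma) ** 2).sum(-1)
    log_p = np.log(weights) - 0.5 * d2
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)       # responsibilities
    dev = (descs[:, None, :] - means[None]) / sigma
    fv = (gamma[:, :, None] * dev).sum(axis=0)
    fv /= descs.shape[0] * np.sqrt(weights)[:, None]
    return fv.ravel()

def sparsify(fv, keep=0.5):
    """Keep only the largest-magnitude fraction `keep` of entries."""
    k = max(1, int(len(fv) * keep))
    thresh = np.sort(np.abs(fv))[-k]
    return np.where(np.abs(fv) >= thresh, fv, 0.0)

rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [3.0, 3.0]])
weights = np.array([0.5, 0.5])
descs = rng.normal(0.5, 1.0, size=(50, 2))   # toy descriptors, near Gaussian 0
fv = fisher_vector_means(descs, means, weights, sigma=1.0)
sparse_fv = sparsify(fv, keep=0.5)
```

Even this tiny example shows why sparsification pays off: the gradients with respect to Gaussians that a descriptor set barely touches are near zero and can be dropped with little loss.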

Driven by the hypothesis that improving the individual steps of an image representation pipeline will improve the final representation, we try various techniques to refine these steps and, in turn, the performance of our model. After improving the final step of creating image representations from visual words, we turn to improving the set of visual words (the vocabulary) itself. We find that many of the visual words used to build image representations are not actually useful, and that the visual vocabulary contains a lot of redundant words. To further improve our representations, we devise a novel technique that draws visual words from different types of vocabularies, each capturing a different type of information from a given set of images, and combines the best of them into a final global vocabulary used to compute the final image representations. Again, we use benchmark datasets to demonstrate that our hypothesis is correct.

In this thesis, we have attempted to solve the classification task in a better way. Although we are not able to solve these research tasks perfectly (i.e., to reach a perfect score), we hope that our findings will at least provide a starting point for new directions that lead towards the ultimate goal of replicating human vision.

     

Year of completion: May 2015
Advisor : Prof. C. V. Jawahar

                           


Related Publications

  • Vinay Garg, Siddhartha Chandra, C V Jawahar - Sparse Discriminative Fisher Vectors in Visual Classification, Proceedings of the 8th Indian Conference on Vision, Graphics and Image Processing, 16-19 Dec. 2012, Bombay, India. [PDF]

  • Vinay Garg, Sreekanth Vempati and C. V. Jawahar - Bag of visual words: A soft clustering based exposition, Proceedings of the 3rd National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, ISBN 978-0-7695-4599-8, pp. 37-40, 15-17 Dec. 2011, Hubli, India. [PDF]

     


Downloads

thesis

ppt
