Towards Understanding Texture Processing


Gopal Datt Joshi (homepage)

A fundamental goal of texture research is to develop automated computational methods for retrieving visual information and understanding image content based on textural properties in images. A synergy between biological and computer vision research in low-level vision can give substantial insights into the processes for extracting color, edge, motion, and spatial frequency information from images. In this thesis, we seek to understand the texture processing that takes place in low-level human vision in order to develop new and effective methods for texture analysis in computer vision. The different representations formed by the early stages of the human visual system (HVS), and the visual computations they carry out to handle various texture patterns, are of interest. Such information is needed to identify the mechanisms that can be used in texture analysis tasks. We examine two types of cells, namely the bar and grating cells, which have been identified in the literature as playing an important role in texture processing, and develop functional models for them. The model for the bar cell is based on the notion of surround inhibition and excitation, whereas the model for the grating cell is based on the fact that a grating cell receives direct inputs from the M-type ganglion cells. The representations derived by these cells are used to design solutions to two important texture problems: texture-based segmentation and classification. The former is addressed in the domain of natural image understanding and the latter in the domain of document image understanding. Based on our work, we conclude that the early stages of the HVS effectively represent various texture patterns and also provide ample information to solve higher-level texture analysis tasks. The richness of information emerges from the capability of the HVS to extract global visual primitives from local features. The presented work is an initial attempt to integrate the current knowledge of HVS mechanisms and computational theories developed for texture analysis.
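To make the surround inhibition and excitation idea concrete, the sketch below computes an oriented edge energy and suppresses it with a surround term, which is the general principle behind the bar cell operator. This is a minimal, hypothetical illustration (the function name and the sigma, theta and alpha parameters are assumptions), not the functional model developed in the thesis.

```python
import numpy as np
from scipy import ndimage

def bar_cell_response(image, sigma=2.0, theta=0.0, alpha=1.0):
    """Toy bar-cell operator: oriented edge energy with surround inhibition."""
    # Oriented derivative of a Gaussian-smoothed image as a crude simple-cell response.
    smoothed = ndimage.gaussian_filter(image.astype(float), sigma)
    gy, gx = np.gradient(smoothed)
    oriented = np.abs(np.cos(theta) * gx + np.sin(theta) * gy)

    # Surround term: the same response averaged over a wider ring, approximated
    # by the positive part of the difference of two Gaussian blurs.
    near = ndimage.gaussian_filter(oriented, 2 * sigma)
    far = ndimage.gaussian_filter(oriented, 4 * sigma)
    surround = np.clip(far - near, 0, None)

    # Excitation from the centre minus inhibition from the surround.
    return np.clip(oriented - alpha * surround, 0, None)
```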

 

Year of completion: December 2006
Advisor: Jayanthi Sivaswamy


Related Publications

  • Gopal Datt Joshi, Saurabh Garg and Jayanthi Sivaswamy - Script Identification from Indian Documents, Proceedings of the IAPR Workshop on Document Analysis Systems (DAS 2006), Nelson, pp. 255-267. [PDF]

  • Gopal Datt Joshi, Saurabh Garg and Jayanthi Sivaswamy - A Generalised Framework for Script Identification, International Journal on Document Analysis and Recognition (IJDAR), 10(2), pp. 55-68, 2007. [PDF]

  • Gopal Datt Joshi and Jayanthi Sivaswamy - A Computational Model for Boundary Detection, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 172-183, 2006. [PDF]

  • Gopal Datt Joshi and Jayanthi Sivaswamy - A Simple Scheme for Contour Detection, Proceedings of the International Conference on Computer Vision and Applications (VISAP 2006), Setubal. [PDF]

  • Gopal Datt Joshi and Jayanthi Sivaswamy - A Multiscale Approach to Contour Detection, Proceedings of the International Conference on Cognition and Recognition, pp. 183-193, Mysore, 2005. [PDF]


Downloads

thesis

 ppt

Layer Extraction, Removal and Completion of Indoor Videos: A Tracking Based Approach


Vardhman Jain (homepage)

Image segmentation and layer extraction in video refer to the process of segmenting an image or video frames into their constituent objects. Automatic techniques for these are not always suitable, as the objective is often difficult to describe. With the advent of interactive techniques in the field, these algorithms can now be used to select an object of interest in an image or video precisely and with little effort. Object segmentation opens up various other possibilities, such as cutting and pasting objects from one image or video to another. Object removal in images and videos is another application of interest. As the name suggests, the task is to eliminate an object from the image or video; this involves recovering the information of the background previously occluded by the object. Object removal in both images and videos has found interesting applications, especially in the entertainment industry. The concept of filling in information from the surrounding region for images, and from surrounding frames for videos, has been applied to recovering damaged images or clips.

This thesis presents two new approaches. The first is for object segmentation, or layer extraction, from a video. This method allows segmenting complex objects in videos, which can have complex motion models. The algorithm integrates a robust point tracking algorithm into a 3D graph cuts formulation. Tracking is used to propagate the user-given seeds in key frames to the intermediate frames, which helps to provide a better initialization to the graph cuts optimization. The second is an approach for video completion in indoor scenes. We propose a novel method for video completion using multiview information without applying a full-frame or complete motion segmentation. The heart of the algorithm is a method to partition the scene into regions supporting multiple homographies based on a geometric formulation, thereby providing precise segmentation even at the points where the actual scene information is missing due to the removal of the object. We demonstrate our algorithms on a number of representative videos. We also present a few directions for future work that extends the work presented here.
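To make the seed-propagation step concrete, the sketch below tracks user-marked seed points from a key frame through subsequent frames with pyramidal Lucas-Kanade optical flow (OpenCV). It is a minimal illustration under assumed inputs (a list of grayscale frames and an (N, 2) array of seed coordinates on the first frame); the 3D graph cuts optimization that consumes the propagated seeds is not shown, and the thesis's actual tracker may differ.

```python
import cv2
import numpy as np

def propagate_seeds(frames, seed_points):
    """Track seed points marked on frames[0] through the remaining frames."""
    seeds_per_frame = [np.asarray(seed_points, dtype=np.float32)]
    pts = seeds_per_frame[0].reshape(-1, 1, 2)
    for prev, curr in zip(frames[:-1], frames[1:]):
        if len(pts) == 0:
            break
        # Pyramidal Lucas-Kanade tracking of the current seed positions.
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None)
        pts = nxt[status.ravel() == 1]     # keep only reliably tracked seeds
        seeds_per_frame.append(pts.reshape(-1, 2))
    return seeds_per_frame
```

The per-frame seed positions would then serve as the foreground/background constraints when building the 3D graph for the graph cuts optimization.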

 

Year of completion: 2006
Advisor: P. J. Narayanan


Related Publications

  • Vardhman Jain and P.J. Narayanan - Layer Extraction Using Graph Cuts and Feature Tracking, The 3rd International Conference on Visual Information Engineering, 26-28 September 2006, Bangalore, India. [PDF]

  • Vardhman Jain and P. J. Narayanan - Video Completion for Indoor Scenes, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 409-420, 2006. [PDF]


Downloads

thesis

 ppt

DGTk: A Data Generation Tool for CV and IBR


V. Vamsi Krishna

Computer Vision (CV) and Image Based Rendering (IBR) are fields that have emerged from the search for a means to make computers understand images as humans do, and from the never-ending pursuit of the computer graphics community to achieve photo-realistic rendering. Though each of these fields deals with completely different problems, both CV and IBR algorithms require high quality ground-truth information about the scenes they are applied on. Traditionally, research groups have spent large amounts of resources on creating data using high-resolution equipment for qualitative analysis of CV and IBR algorithms. Such high quality data provided a platform for comparison of CV and IBR algorithms. Though these datasets have enabled comparison of algorithms, during the past decade the developments in the fields of CV and IBR have outpaced the ability of such standard datasets to differentiate among the best performing algorithms, and the resources invested in generating these datasets are wasted. To overcome this problem, researchers have resorted to creating synthetic datasets by extending existing 3D authoring tools, developing stand-alone tools for generating synthetic data, and developing novel methods of data acquisition for acquiring high quality real-world data. The disadvantages of acquiring data using high-resolution equipment include (1) the time required for setting up the configuration of the equipment, (2) errors in measuring devices due to physical limitations, and (3) poor repeatability of experiments due to uncontrollable parameters like wind, fog and rain. Synthetic data is preferred for the early testing of algorithms, since it makes qualitative and quantitative analysis possible. The performance of an algorithm on synthetic data generally provides a good indication of its performance on real-world data.

 

Year of completion: 2006
Advisor: P. J. Narayanan

Related Publications

  • Vamsikrishna and P.J. Narayanan - Data Generation Toolkit for Image Based Rendering Algorithms, The 3rd International Conference on Visual Information Engineering, 26-28 September 2006, Bangalore, India. [PDF]


Downloads

thesis

 ppt

Robust Registration For Video Mosaicing Using Camera Motion Properties


Pulkit Parikh (Home Page)

In recent years, video mosaicing has emerged as an important problem in the domains of computer vision and computer graphics. Video mosaics find applications in many popular arenas including video compression, virtual environments and panoramic photography. Mosaicing is the process of generating a single, large, integrated image by combining the visual cues from multiple images. The composite image, called a mosaic, provides a high field of view without compromising the image resolution. When the input images are the frames of a video, the process is called video mosaicing. The process of mosaicing primarily consists of two key components: image registration and image compositing. The focus of this work is on image registration - the process of estimating a transformation that relates the frames of the video. In addition to mosaicing, registration has a wide range of other applications in ortho-rectification, scene transfer, video in-painting, etc. We employ the homography, the general 2D projective transformation, for image registration. Typically, homography estimation is done from a set of matched feature points extracted from the frames of the input video.
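For reference, the sketch below shows a standard feature-based estimate of the homography between two frames (ORB features matched with a brute-force matcher, followed by RANSAC) using OpenCV. It is a generic illustration of this registration step, not the thesis's specific pipeline; the detector choice and the reprojection threshold are assumptions.

```python
import cv2
import numpy as np

def pairwise_homography(frame_a, frame_b):
    """Estimate the homography mapping frame_a onto frame_b from matched features."""
    orb = cv2.ORB_create(2000)
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)

    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC discards mismatched feature points before the final fit.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H, inlier_mask
```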

Video mosaicing has traditionally been viewed as a problem of registering (and stitching) only successive video frames. While some of the recent global alignment approaches make use of the information stemming from non-consecutive pairs of video frames, the registration of each frame pair is typically done independently. No real emphasis has been laid on how the imaging process and the camera motion relate the pair-wise homographies. Therefore, accurate registration, and in turn mosaicing, especially in the face of poor feature point correspondence, still remains a challenging task. For example, mosaicing a desert video wherein the frames have minimal texture to provide feature points (e.g. corners), or mosaicing in the presence of repetitive texture leading to several mismatched points, is highly error-prone.

It is known that the camera that captures the video frames to be mosaiced does not undergo arbitrary motion; its trajectory is almost always smooth. Some examples where such motion is seen are aerial videos taken by a camera mounted on an aircraft, and videos captured by robots, automated vehicles, etc. We propose an approach which exploits this smoothness constraint to refine outlier homographies, i.e., homographies that are detected to be erroneous. In many scenarios, especially in machine-controlled environments, the camera motion, apart from being continuous, also follows a model (say a linear motion model), giving us much more concrete and precise information. In this thesis, we derive relationships between homographies in a video sequence, under many practical scenarios, for various camera motion models. In other words, we translate the camera motion model into a global homography model whose parameters can characterize the homographies between every pair of frames. We present a generic methodology which uses this global homography model for accurate registration.
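The sketch below illustrates the smoothness constraint in its simplest form: a frame-to-frame homography that deviates strongly from its neighbours is flagged as an outlier and replaced by a locally smoothed estimate. This is only a simplified, assumed reading of the general idea; the thesis derives model-based relationships between the homographies rather than using a neighbour average, and the tolerance value here is illustrative.

```python
import numpy as np

def refine_outlier_homographies(H_list, tol=0.1):
    """Patch erroneous pairwise homographies under a smooth-motion assumption."""
    H = [Hi / Hi[2, 2] for Hi in H_list]              # fix the scale ambiguity
    refined = list(H)
    for i in range(1, len(H) - 1):
        neighbour_avg = (H[i - 1] + H[i + 1]) / 2.0
        deviation = np.linalg.norm(H[i] - neighbour_avg) / np.linalg.norm(neighbour_avg)
        if deviation > tol:                            # homography breaks smoothness
            refined[i] = neighbour_avg
    return refined
```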

The above derivations and algorithms have been implemented, verified and tested on various datasets. The analysis and comparative results demonstrate significant improvement over the existing approaches in terms of accuracy and robustness. Superior quality mosaics have been produced using our algorithms in the presence of many commonly observed issues like texture-less frames and frames containing repetitive texture, with applicability to indoor (e.g., mosaicing a map) as well as outdoor (e.g., representing an aerial video as a mosaic) settings. For quantitative evaluation of mosaics, we have devised a novel, computationally efficient quality measure. The quantitative results are completely in tune with visual perception, further testifying to the superiority of our approach. In a nutshell, the premise that the properties of the camera motion can serve as an important additional cue for improved video mosaicing has been empirically validated.

 

Year of completion: 2007
Advisor: C. V. Jawahar


Related Publications

  • Pulkit Parikh and C.V. Jawahar - Enhanced Video Mosaicing using Camera Motion Properties, Proc. of IEEE Workshop on Motion and Video Computing (WMVC 2007), Austin, Texas, USA, 2007. [PDF]

  • Pulkit Parikh and C.V. Jawahar - Motion Constraints for Video Mosaicing, The 3rd International Conference on Visual Information Engineering, 26-28 September 2006, Bangalore, India. [PDF]

 


Downloads

thesis

 ppt

Enhancing Weak Biometric Authentication by Adaptation and Improved User-Discrimination


Vandana Roy

Biometric technologies are becoming the foundation of an extensive array of person identification and verification solutions. Biometrics is defined as the science of recognising a person based on certain physiological (fingerprints, face, hand-geometry) or behavioral (voice, gait, keystrokes) characteristics. Weak biometrics (hand-geometry, face, voice) are traits which possess low discriminating content and change over time for each individual; they therefore show lower performance than strong biometrics (e.g., fingerprints, iris, retina). Due to the exponentially decreasing costs of hardware and computation, biometrics has found immense use in civilian applications (time and attendance monitoring, physical access to buildings, human-computer interfaces, etc.) beyond forensic ones (e.g., criminal and terrorist identification). Various factors come into the picture while selecting biometric traits for civilian applications, the most important of which are user psychology and acceptability. Most weak biometric traits have little or no association with criminal history, unlike fingerprints (a strong biometric), and data acquisition is also very simple and easy with weak biometrics. For these reasons, weak biometric traits are often better accepted for civilian applications than strong biometric traits. Moreover, not much research has gone into this area as compared to strong biometrics.

Due to their low discriminating content, weak biometric traits result in poor verification performance. We propose a feature selection technique called Single Class Hierarchical Discriminant Analysis (SCHDA), designed specifically for authentication in biometric systems. SCHDA recursively identifies the samples which overlap with the samples of the claimed identity in the discriminant space built by the single-class discriminant criterion. If samples of the claimed identity are termed "positive" samples and all other samples "negative" samples, the single-class discriminant criterion finds an optimal transformation such that the ratio of the negative scatter with respect to the positive mean over the positive within-class scatter is maximized, thereby pulling together the positive samples and pushing the negative samples away from the positive mean. Thus SCHDA builds an optimal user-specific discriminant space for each individual in which the samples of the claimed identity are well separated from the samples of all other users. The performance of authentication using this technique is compared with other popular discriminant analysis techniques in the literature, and significant improvement has been observed.
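The single-class criterion described above can be written down compactly: with the scatter of the negative samples about the positive mean as S_neg and the within-class scatter of the positive samples as S_pos, the projection maximises the generalised Rayleigh quotient of S_neg over S_pos. The sketch below computes one level of this criterion; it is an illustration under assumed array inputs, not the full recursive SCHDA procedure, and the regularisation term and output dimensionality are arbitrary choices.

```python
import numpy as np
from scipy.linalg import eigh

def single_class_discriminant(positive, negative, n_dims=2):
    """Directions maximising negative scatter (about the positive mean) over positive scatter."""
    mu_pos = positive.mean(axis=0)

    # Scatter of the negative samples around the positive mean.
    d_neg = negative - mu_pos
    S_neg = d_neg.T @ d_neg

    # Within-class scatter of the positive (claimed-identity) samples, regularised.
    d_pos = positive - mu_pos
    S_pos = d_pos.T @ d_pos + 1e-6 * np.eye(positive.shape[1])

    # Generalised eigenproblem S_neg w = lambda S_pos w; keep the top directions.
    vals, vecs = eigh(S_neg, S_pos)
    W = vecs[:, np.argsort(vals)[::-1][:n_dims]]
    return W          # project samples with X @ W
```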

The second problem which leads to low authentication accuracy is the poor permanence of weak biometric traits, which change over time for various reasons (e.g., ageing or the person gaining or losing weight). Civilian applications usually operate in a cooperative or monitored mode wherein the users can give feedback to the system when errors occur. An intelligent adaptive framework is proposed which uses this feedback to incrementally update the parameters of the feature selection and verification framework for each individual. This technique does not require the system to be re-trained to address the issue of changing features.
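As a toy illustration of the incremental-update idea (not the incremental biased discriminant analysis used in the thesis), the sketch below keeps a running mean and scatter of a user's features and folds in each feedback-confirmed sample without retraining from scratch.

```python
import numpy as np

class IncrementalUserModel:
    """Running estimate of a user's feature mean and scatter (illustrative only)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))

    def update(self, sample):
        # Standard one-pass update of the mean and unnormalised scatter matrix.
        self.n += 1
        delta = sample - self.mean
        self.mean += delta / self.n
        self.scatter += np.outer(delta, sample - self.mean)
```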

The third factor which has been explored to improve authentication performance for civilian applications is the pattern of participation of the enrolled users. As new users are enrolled into the system, a degradation in performance is observed due to the increasing number of users. Traditionally, the system has to be re-trained periodically with the existing users to take care of this issue. An interesting observation is that although the number of users enrolled into the system can be very high, the number of users who regularly participate in the authentication process is comparatively low. Thus, modeling the variation in the participating population helps to bypass the re-training process. We propose to model the variation in the participating population using Markov models. Using these models, the prior probability of participation of each individual is computed and incorporated into the traditional feature selection framework, giving more relevance to the parameters of regularly participating users. Both structured and unstructured modes of variation of participation were explored. Experiments were conducted on varied datasets, verifying our claim that incorporating the prior probability of participation helps to improve the performance of a biometric system over time.
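One simple way to realise this is to fit a two-state (absent/present) Markov chain to each user's attendance history and take its stationary probability of being present as the participation prior, as sketched below. The binary attendance-matrix input format is an assumption, and the thesis's structured and unstructured models may differ from this minimal version.

```python
import numpy as np

def participation_priors(participation):
    """Prior probability of participation per user from a (sessions x users) 0/1 matrix."""
    num_sessions, num_users = participation.shape
    priors = np.zeros(num_users)
    for u in range(num_users):
        seq = participation[:, u]
        # Transition counts between consecutive sessions, with add-one smoothing.
        counts = np.ones((2, 2))
        for a, b in zip(seq[:-1], seq[1:]):
            counts[int(a), int(b)] += 1
        P = counts / counts.sum(axis=1, keepdims=True)
        # Stationary probability of the "present" state of the two-state chain.
        p01, p10 = P[0, 1], P[1, 0]
        priors[u] = p01 / (p01 + p10)
    return priors
```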

In order to validate our claims and techniques, we used hand-geometry and keystroke-based biometric traits. The hand images were acquired using a simple low-cost setup consisting of a digital camera and a flat translucent platform with five rigid pegs (to ensure that the acquired images are well aligned). The platform is illuminated from beneath so as to simplify the preprocessing of the acquired images. The features used for hand-geometry include the lengths of four fingers and the widths at five equidistant points on each finger. Features of the thumb are not used, as its measurements show high variability for the same user. This dataset was used to validate the proposed feature selection technique. For keystroke-based biometrics, the features used were the dwell time (duration of the key-press event) and flight time (duration between a key-release and the next key-press event) of each key, and the number of times the backspace and delete keys were pressed. Data was collected from subjects who were not accustomed to the particular kind of keyboard used (a French keyboard). The features extracted from this dataset were time-varying and were used to validate the concept of incremental updating.
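For reference, the keystroke features described above can be computed from raw key events as follows; the (key, press_time, release_time) event format is a hypothetical assumption made for this sketch.

```python
def keystroke_features(events):
    """Dwell times, flight times and correction count from a typing session.

    `events` is a chronologically ordered list of (key, press_time, release_time) tuples.
    """
    dwell = [release - press for _, press, release in events]                    # key held down
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]   # release to next press
    corrections = sum(1 for key, _, _ in events if key in ("Backspace", "Delete"))
    return dwell, flight, corrections
```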

In this thesis, we identify and address some of the issues which lead to low performance of authentication using certain weak biometric traits. We also look into the problem of low performance of authentication in large-scale biometrics for civilian applications.

 

Year of completion: 2007
Advisor: C. V. Jawahar

Related Publications

  • Vandana Roy and C. V. Jawahar - Modeling Time-Varying Population for Biometric Authentication, Proceedings of the International Conference on Computing: Theory and Applications (ICCTA), Kolkata, 2007. [PDF]

  • Vandana Roy and C. V. Jawahar - Hand-Geometry Based Person Authentication Using Incremental Biased Discriminant Analysis, Proceedings of the National Conference on Communication (NCC 2006), Delhi, January 2006, pp. 261-265. [PDF]

  • Vandana Roy and C. V. Jawahar - Feature Selection for Hand-Geometry based Person Authentication, Proceedings of the Thirteenth International Conference on Advanced Computing and Communications, Coimbatore, December 2005. [PDF]


Downloads

thesis

 ppt

 

 
