
DGTk: A Data Generation Tool for CV and IBR


V. Vamsi Krishna

Computer Vision (CV) and Image Based Rendering (IBR) are fields that emerged from the quest to make computers understand images the way humans do, and from the Computer Graphics community's never-ending pursuit of photo-realistic rendering. Though the two fields address quite different problems, both CV and IBR algorithms require high-quality ground-truth information about the scenes they are applied to. Traditionally, research groups have spent large amounts of resources on creating data with high-resolution equipment for qualitative analysis of CV and IBR algorithms. Such high-quality data provided a platform for comparing CV and IBR algorithms. Though these datasets have enabled comparison of algorithms, over the past decade the development in the fields of CV and IBR has outpaced the ability of such standard datasets to differentiate among the best-performing algorithms. The resources invested in generating these datasets are then effectively wasted. To overcome this problem, researchers have resorted to creating synthetic datasets by extending existing 3D authoring tools, developing stand-alone tools for generating synthetic data, and devising novel methods for acquiring high-quality real-world data. The disadvantages of acquiring data using high-resolution equipment include (1) the time required to set up and configure the equipment, (2) errors in measuring devices due to physical limitations, and (3) poor repeatability of experiments due to uncontrollable factors like wind, fog and rain. Synthetic data is preferred for the early testing of algorithms, since it makes both qualitative and quantitative analysis possible. The performance of an algorithm on synthetic data generally provides a good indication of its performance on real-world data. (more...)

 

Year of completion:  2006
 Advisor : P. J. Narayanan

Related Publications

  • Vamsikrishna and P.J. Narayanan - Data Generation Toolkit for Image Based Rendering Algorithms, The 3rd International Conference on Visual Information Engineering, 26-28 September 2006, Bangalore, India. [PDF]


Downloads

thesis

 ppt

Robust Registration For Video Mosaicing Using Camera Motion Properties


Pulkit Parikh (Home Page)

In recent years, video mosaicing has emerged as an important problem in the domain of computer vision and computer graphics. Video mosaics find applications in many popular arenas including video compression, virtual environments and panoramic photography. Mosaicing is the process of generating a single, large, integrated image by combining the visual cues from multiple images. The composite image, called a mosaic, provides a wide field of view without compromising image resolution. When the input images are the frames of a video, the process is called video mosaicing. Mosaicing primarily consists of two key components: image registration and image compositing. The focus of this work is on image registration - the process of estimating a transformation that relates the frames of the video. In addition to mosaicing, registration has a wide range of other applications in ortho-rectification, scene transfer, video in-painting, etc. We employ the homography, the general 2D projective transformation, for image registration. Typically, homography estimation is done from a set of matched feature points extracted from the frames of the input video.
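
As an illustration of the registration step described above, the sketch below estimates a pairwise homography from matched feature points using standard OpenCV routines (ORB features and RANSAC). This is a generic illustration, not the estimation pipeline developed in the thesis.

```python
# Minimal sketch: pairwise homography estimation from matched feature points.
# Uses standard OpenCV routines (ORB + RANSAC); illustrative only, not the
# thesis's own registration pipeline.
import cv2
import numpy as np

def estimate_homography(frame_a, frame_b, max_features=1000):
    """Return the 3x3 homography mapping frame_a onto frame_b, or None."""
    orb = cv2.ORB_create(max_features)
    kps_a, desc_a = orb.detectAndCompute(frame_a, None)
    kps_b, desc_b = orb.detectAndCompute(frame_b, None)
    if desc_a is None or desc_b is None:
        return None

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(desc_a, desc_b), key=lambda m: m.distance)
    if len(matches) < 4:  # a homography needs at least 4 correspondences
        return None

    pts_a = np.float32([kps_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kps_b[m.trainIdx].pt for m in matches])
    H, inlier_mask = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 3.0)
    return H
```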

Video mosaicing has traditionally been viewed as a problem of registering (and stitching) only successive video frames. While some of the recent global alignment approaches make use of information stemming from non-consecutive pairs of video frames, the registration of each frame pair is typically done independently. No real emphasis has been laid on how the imaging process and the camera motion relate the pair-wise homographies. Therefore, accurate registration, and in turn mosaicing, especially in the face of poor feature-point correspondences, still remains a challenging task. For example, mosaicing a desert video whose frames have too little texture to provide feature points (e.g. corners), or mosaicing in the presence of repetitive texture that leads to many mismatched points, is highly error-prone.

It is known that the camera capturing the video frames to be mosaiced does not undergo arbitrary motion. The trajectory of the camera is almost always smooth. Examples of such motion include aerial videos taken by a camera mounted on an aircraft and videos captured by robots, automated vehicles, etc. We propose an approach which exploits this smoothness constraint to refine outlier homographies, i.e., homographies that are detected to be erroneous. In many scenarios, especially in machine-controlled environments, the camera motion, apart from being continuous, also follows a model (say a linear motion model), giving us much more concrete and precise information. In this thesis, we derive relationships between homographies in a video sequence, under many practical scenarios, for various camera motion models. In other words, we translate the camera motion model into a global homography model whose parameters characterize the homographies between every pair of frames. We present a generic methodology which uses this global homography model for accurate registration.
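
One simple way to picture the smoothness constraint (a conceptual approximation, not the global homography model derived in the thesis) is to fit a low-order polynomial to each homography entry across the frame index and replace homographies flagged as outliers with the fitted values:

```python
# Hedged sketch of smoothness-based refinement: fit a low-order polynomial to
# each of the 9 homography entries over time and replace flagged outliers with
# the fitted values. The thesis derives a principled global homography model
# from the camera motion model; this is only a conceptual approximation.
import numpy as np

def refine_homographies(homographies, outlier_flags, degree=2):
    """homographies: (T, 3, 3) array of frame-to-frame homographies
    (normalized so H[2, 2] == 1); outlier_flags: boolean array of length T."""
    H = np.asarray(homographies, dtype=float)
    T = H.shape[0]
    t = np.arange(T)
    good = ~np.asarray(outlier_flags)

    refined = H.copy()
    params = H.reshape(T, 9)
    for k in range(9):  # fit each homography entry independently over time
        coeffs = np.polyfit(t[good], params[good, k], deg=degree)
        fitted = np.polyval(coeffs, t)
        refined.reshape(T, 9)[~good, k] = fitted[~good]
    return refined
```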

The above derivations and algorithms have been implemented, verified and tested on various datasets. The analysis and comparative results demonstrate significant improvement over existing approaches in terms of accuracy and robustness. Superior-quality mosaics have been produced by our algorithms in the presence of commonly observed issues like texture-less frames and frames containing repetitive texture, with applicability to indoor (e.g., mosaicing a map) as well as outdoor (e.g., representing an aerial video as a mosaic) settings. For quantitative evaluation of mosaics, we have devised a novel, computationally efficient quality measure. The quantitative results are fully consistent with visual perception, further attesting to the superiority of our approach. In a nutshell, the premise that the properties of the camera motion can serve as an important additional cue for improved video mosaicing has been empirically validated.

 

Year of completion:  2007
 Advisor : C. V. Jawahar


Related Publications

  • Pulkit Parikh and C.V. Jawahar - Enhanced Video Mosaicing using Camera Motion Properties, Proc. of IEEE Workshop on Motion and Video Computing (WMVC 2007), Austin, Texas, USA, 2007. [PDF]

  • Pulkit Parikh and C.V. Jawahar - Motion Constraints for Video Mosaicing, The 3rd International Conference on Visual Information Engineering, 26-28 September 2006, Bangalore, India. [PDF]

 


Downloads

thesis

 ppt

Enhancing Weak Biometric Authentication by Adaptation and Improved User-Discrimination


Vandana Roy

Biometric technologies are becoming the foundation of an extensive array of person identification and verification solutions. Biometrics is the science of recognising a person based on certain physiological (fingerprints, face, hand-geometry) or behavioral (voice, gait, keystrokes) characteristics. Weak biometrics (hand-geometry, face, voice) are traits that possess low discriminating content and change over time for each individual. They therefore show lower performance compared to strong biometrics (e.g. fingerprints, iris, retina). Due to the rapidly decreasing cost of hardware and computation, biometrics has found immense use in civilian applications (time and attendance monitoring, physical access to buildings, human-computer interfaces, etc.) beyond forensic ones (e.g. criminal and terrorist identification). Various factors come into the picture while selecting biometric traits for civilian applications, the most important of which are user psychology and acceptability. Most weak biometric traits have little or no association with criminal history, unlike fingerprints (a strong biometric), and data acquisition is also very simple and easy with weak biometrics. For these reasons, weak biometric traits are often better accepted for civilian applications than strong biometric traits. Moreover, not much research has gone into this area compared to strong biometrics.

Because of their low discriminating content, weak biometric traits result in poor verification performance. We propose a feature selection technique called Single Class Hierarchical Discriminant Analysis (SCHDA), specifically for the authentication task in biometric systems. SCHDA recursively identifies the samples which overlap with the samples of the claimed identity in the discriminant space built by the single-class discriminant criterion. If the samples of the claimed identity are termed ``positive'' samples, and all other samples ``negative'' samples, the single-class discriminant criterion finds an optimal transformation such that the ratio of the negative scatter with respect to the positive mean over the positive within-class scatter is maximized, thereby pulling the positive samples together and pushing the negative samples away from the positive mean. SCHDA thus builds an optimal user-specific discriminant space for each individual, in which the samples of the claimed identity are well separated from the samples of all other users. The authentication performance of this technique is compared with other popular discriminant analysis techniques in the literature, and significant improvement has been observed.
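
As a rough sketch of the single-class discriminant criterion described above (the full recursive SCHDA procedure is in the thesis and is not reproduced here), one seeks a projection that maximizes the scatter of negative samples about the positive mean relative to the within-class scatter of the positive samples, which reduces to a generalized eigenvalue problem:

```python
# Rough sketch of the single-class discriminant criterion underlying SCHDA:
# maximize (scatter of negatives about the positive mean) / (positive
# within-class scatter). Variable names are illustrative; the recursive
# SCHDA procedure itself is not reproduced here.
import numpy as np
from scipy.linalg import eigh

def single_class_discriminant(X_pos, X_neg, n_components=1, reg=1e-6):
    """X_pos: samples of the claimed identity; X_neg: all other samples."""
    mu_pos = X_pos.mean(axis=0)

    # Scatter of negative samples about the positive mean.
    D_neg = X_neg - mu_pos
    S_neg = D_neg.T @ D_neg

    # Within-class scatter of positive samples (regularized for stability).
    D_pos = X_pos - mu_pos
    S_pos = D_pos.T @ D_pos + reg * np.eye(X_pos.shape[1])

    # Generalized eigenproblem: S_neg w = lambda * S_pos w; keep top directions.
    eigvals, eigvecs = eigh(S_neg, S_pos)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order[:n_components]]  # columns span the discriminant space
```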

The second problem which leads to low authentication accuracy is the poor permanence of weak biometric traits, which change for various reasons (e.g. ageing, the person gaining or losing weight). Civilian applications usually operate in a cooperative or monitored mode wherein users can give feedback to the system when errors occur. An intelligent adaptive framework is proposed which uses this feedback to incrementally update the parameters of the feature selection and verification framework for each individual. This technique does not require the system to be re-trained to address the issue of changing features.

The third factor explored to improve authentication performance for civilian applications is the pattern of participation of the enrolled users. As new users are enrolled into the system, a degradation in performance is observed due to the increasing number of users. Traditionally, the system must be re-trained periodically with the existing users to take care of this issue. An interesting observation is that although the number of users enrolled in the system can be very high, the number of users who regularly participate in the authentication process is comparatively low. Thus, modeling the variation in the participating population helps to bypass the re-training process. We propose to model the variation in the participating population using Markov models. Using these models, the prior probability of participation of each individual is computed and incorporated into the traditional feature selection framework, giving more weight to the parameters of regularly participating users. Both structured and unstructured modes of variation of participation were explored. Experiments were conducted on varied datasets, verifying our claim that incorporating the prior probability of participation helps to improve the performance of a biometric system over time.
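
As a hedged illustration of the participation modeling described above (the thesis's exact Markov formulation of structured and unstructured participation may differ), one can estimate a per-user two-state present/absent Markov chain from past sessions and use its stationary probability of being present as the prior probability of participation:

```python
# Hedged sketch: per-user two-state (absent/present) Markov chain estimated
# from past authentication sessions; its stationary probability of "present"
# serves as the prior probability of participation. The thesis's exact
# formulation may differ.
import numpy as np

def participation_prior(history, smoothing=1.0):
    """history: binary sequence, 1 if the user participated in a session."""
    h = np.asarray(history, dtype=int)
    counts = np.full((2, 2), smoothing)  # Laplace-smoothed transition counts
    for prev, curr in zip(h[:-1], h[1:]):
        counts[prev, curr] += 1
    P = counts / counts.sum(axis=1, keepdims=True)

    # Stationary distribution of a 2-state chain: pi_present = p01 / (p01 + p10).
    p01, p10 = P[0, 1], P[1, 0]
    return p01 / (p01 + p10)

# Example: a user active in most recent sessions gets a high prior.
print(participation_prior([0, 1, 1, 1, 0, 1, 1, 1, 1]))
```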

To validate our claims and techniques, we used hand-geometry and keystroke-based biometric traits. The hand images were acquired using a simple low-cost setup consisting of a digital camera and a flat translucent platform with five rigid pegs (to ensure that the acquired images are well aligned). The platform is illuminated from beneath so as to simplify the preprocessing of the acquired images. The features used for hand-geometry include the lengths of four fingers and the widths at five equidistant points on each finger. Features of the thumb are not used, as its measurements show high variability for the same user. This dataset was used to validate the proposed feature selection technique. For keystroke-based biometrics, the features used were the dwell time (duration of a key-press event) and flight time (duration between a key-release and the next key-press event) of each key, and the number of times the backspace and delete keys were pressed. Data was collected from subjects who were not accustomed to the particular kind of keyboard used (a French keyboard). The features extracted from this dataset were time-varying, and the dataset was used to validate the concept of incremental updating.
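
For concreteness, the sketch below computes the dwell-time, flight-time and correction-count features described above from a list of (key, press_time, release_time) events; this event format is an assumption made for illustration, not the acquisition format used in the thesis.

```python
# Illustrative sketch of keystroke-dynamics features: dwell time (key-press
# duration), flight time (gap between a key release and the next key press),
# and the number of Backspace/Delete presses. The (key, press_time,
# release_time) event format is assumed for illustration.

def keystroke_features(events):
    """events: list of (key, press_time, release_time), ordered by press_time."""
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    corrections = sum(1 for key, _, _ in events if key in ("Backspace", "Delete"))
    return dwell, flight, corrections

# Example with times in milliseconds.
sample = [("h", 0, 90), ("i", 150, 230), ("Backspace", 400, 470), ("i", 600, 680)]
print(keystroke_features(sample))
```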

In this thesis, we identify and address some of the issues which lead to low performance of authentication using certain weak biometric traits. We also look into the problem of low performance of authentication in large-scale biometrics for civilian applications.

 

Year of completion:  2007
 Advisor : C. V. Jawahar

Related Publications

  • Vandana Roy and C. V. Jawahar - Modeling Time-Varying Population for Biometric Authentication, In International Conference on Computing: Theory and Applications (ICCTA), Kolkata, 2007. [PDF]

  • Vandana Roy and C. V. Jawahar - Hand-Geometry Based Person Authentication Using Incremental Biased Discriminant Analysis, Proceedings of the National Conference on Communication (NCC 2006), Delhi, January 2006, pp. 261-265. [PDF]

  • Vandana Roy and C. V. Jawahar - Feature Selection for Hand-Geometry based Person Authentication, Proceedings of the Thirteenth International Conference on Advanced Computing and Communications, Coimbatore, December 2005. [PDF]


Downloads

thesis

 ppt

 

 

Vision based Robot Navigation using an On-line Visual Experience


D. Santosh Kumar (homepage)

Vision-based robot navigation has long been a fundamental goal in both robotics and computer vision research. While the problem is largely solved for robots equipped with active range-finding devices, for a variety of reasons the task still remains challenging for robots equipped only with vision sensors. Vision is an attractive sensor as it helps in the design of economically viable systems with simpler sensing requirements. It facilitates passive sensing of the environment and provides valuable semantic information about the scene that is unavailable to other sensors. Two popular paradigms have emerged to analyze this problem, namely model-based and model-free algorithms. Model-based approaches require model information to be available a priori, whereas in model-free approaches the required 3D information is computed online. Model-free navigation paradigms have gained popularity over model-based approaches due to their simpler assumptions and wider applicability. This thesis discusses a new paradigm for vision-based navigation, namely image-based navigation. The basic observation is that model-free paradigms involve an intermediate depth computation which is redundant for the purpose of navigation. Rather, the motion instructions required to control the robot can be inferred directly from the acquired images. This approach is attractive because the modeling of objects is simply replaced by the memorization of views, which is far easier than 3D modeling.


In this thesis, a new image-based navigation architecture is developed which facilitates online learning about the world by a robot. The framework enables a robot to autonomously explore and navigate a variety of unknown environments, in a way that facilitates path planning and goal-oriented tasks, using visual maps that are built contextually in the process. It also allows feedback received from performing specific goal-oriented tasks to be incorporated to update the visual representation. Based on this architecture, the design of the individual algorithms required for performing the navigation task (exploration, servoing and learning) is discussed. (more...)
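
To make the servoing component concrete, the sketch below shows the classical image-based visual servoing (IBVS) law for point features: the camera velocity is computed from the image-space feature error through the pseudo-inverse of the interaction matrix. This is the standard textbook controller, not the specific controller developed in the thesis, and it assumes the point depths are known or estimated.

```python
# Classical image-based visual servoing (IBVS) law for point features:
# v = -lambda * pinv(L) * (s - s*). Shown only to make the servoing component
# concrete; not the controller developed in the thesis. Depths Z are assumed
# known or estimated.
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix of a normalized image point (x, y) at depth Z."""
    return np.array([
        [-1.0 / Z, 0.0,      x / Z, x * y,     -(1 + x * x),  y],
        [0.0,      -1.0 / Z, y / Z, 1 + y * y, -x * y,       -x],
    ])

def ibvs_velocity(current, desired, depths, gain=0.5):
    """current, desired: (N, 2) normalized image points; depths: length-N array.
    Returns the 6-vector camera velocity (vx, vy, vz, wx, wy, wz)."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(current, depths)])
    error = (np.asarray(current) - np.asarray(desired)).reshape(-1)
    return -gain * np.linalg.pinv(L) @ error
```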

 

Year of completion:  June 2007

 Advisor : C. V. Jawahar

Related Publications

  • D. Santosh and C.V. Jawahar - Visual Servoing in Non-Rigid Environment: A Space-Time Approach, Proc. of IEEE International Conference on Robotics and Automation (ICRA'07), Roma, Italy, 2007. [PDF]

  • D. Santosh Kumar and C.V. Jawahar - Visual Servoing in Presence of Non-Rigid Motion, Proc. 18th IEEE International Conference on Pattern Recognition (ICPR'06), Hong Kong, Aug 2006. [PDF]

  • D. Santosh Kumar and C.V. Jawahar - Robust Homography-Based Control for Camera Positioning in Piecewise Planar Environment, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 906-918, 2006. [PDF]

  • D. Santosh and C.V. Jawahar - Cooperative CONDENSATION-based Recognition, in 8th Asian Conference on Computer Vision (ACCV) (Under Review), 2007.
  • D. Santosh and C.V. Jawahar - Visual Servoing in Non-Rigid Environments, in IEEE Transactions on Robotics (ITRO) (Under Submission), 2008.
  • D. Santosh and C.V. Jawahar - Robot Path Planning by Reinforcement Learning along with Potential Fields, in 25th IEEE International Conference on Robotics and Automation (ICRA) (Under Submission), 2008.
  • D. Santosh, A. Supreeth and C.V. Jawahar - Mobile Robot Exploration and Navigation using a Single Camera, in 25th IEEE International Conference on Robotics and Automation (ICRA) (Under Submission), 2008.

Downloads

thesis

 ppt

Kernel Methods and Factorization for Image and Video Analysis


Ranjeeth Kumar Dasineni (homepage)

Image and Video Analysis is one of the most active research areas in computer science, with a large number of applications in security, surveillance, broadcast video processing, etc. Until about two decades ago, the primary focus in this domain was on efficient processing of image and video data. However, with the increase in computational power and advances in Machine Learning, the focus has shifted to a wide range of other problems. Machine learning techniques have been widely used to perform higher-level tasks such as recognizing faces from images, facial expression analysis in videos, printed document recognition and video understanding, which require extensive analysis of data. The field of Machine Learning itself witnessed the evolution of Kernel Methods as a principled and efficient approach to analyzing nonlinear relationships in data. These algorithms are computationally efficient and statistically stable, in stark contrast with previous methods used for nonlinear problems, such as neural networks and decision trees, which often suffered from overfitting and computational expense. In addition, kernel methods provide a natural way to treat heterogeneous data (such as categorical data, graphs and sequences) under a unified framework. These advantages led to their immense popularity in many fields, such as computer vision, data mining and bioinformatics. In computer vision, the use of kernel methods such as the support vector machine, kernel principal component analysis and kernel discriminant analysis has resulted in remarkable improvements in performance at tasks such as classification, recognition and feature extraction. Like Kernel Methods, Factorization techniques have enabled elegant solutions to many problems in computer vision, such as eliminating redundancy in the representation of data and analyzing their generative processes. Structure from Motion and Eigenfaces for feature extraction are examples of successful applications of factorization in vision. However, factorization has so far been used on the traditional matrix representation of image collections and videos. This representation fails to fully exploit the structure in 2D images, as each image is represented by a single 1D vector. Tensors are more natural representations for such data and have recently gained wide attention in computer vision. Factorization becomes an even more useful tool with such representations.


While Kernel Methods and Factorization both aid in the analysis of data and the detection of inherent regularities, they do so in orthogonal ways. The central idea in kernel methods is to work with new sets of features derived from the input features. Factorization, on the other hand, operates by eliminating redundant or irrelevant information. Thus, they form a complementary set of tools for analyzing data. This thesis addresses the problem of effective manipulation of the dimensionality of visual data representations, using these tools, for solving problems in image analysis. The purpose of this thesis is threefold: i) demonstrating useful applications of kernel methods to problems in image analysis: new kernel algorithms are developed for feature selection and time-series modeling and are used for biometric authentication using weak features, planar shape recognition and handwritten character recognition; ii) using the tensor representation and factorization of tensors to solve challenging problems in facial video analysis: simple and efficient methods are developed for expression transfer, expression recognition and face morphing; iii) investigating and demonstrating the complementary nature of Kernelization and Factorization and how they can be used together for the analysis of data. (more...)
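
A small sketch of how the two tools can be chained (a generic illustration, not one of the thesis's algorithms): a kernel map first derives nonlinear features from the inputs, and a factorization then strips redundancy from the resulting representation.

```python
# Generic illustration of the complementary roles described above (not an
# algorithm from the thesis): Kernel PCA derives new nonlinear features from
# the inputs, and a truncated SVD then factors out redundant directions.
import numpy as np
from sklearn.decomposition import KernelPCA, TruncatedSVD

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))          # stand-in for vectorized images

# Step 1 (kernelization): work with features derived through an RBF kernel.
kpca = KernelPCA(n_components=32, kernel="rbf", gamma=0.01)
X_kernel = kpca.fit_transform(X)

# Step 2 (factorization): eliminate redundancy in the derived representation.
svd = TruncatedSVD(n_components=8, random_state=0)
X_compact = svd.fit_transform(X_kernel)

print(X_kernel.shape, X_compact.shape)  # (200, 32) -> (200, 8)
```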

 

Year of completion:  December 2007
 Advisor : C. V. Jawahar


Related Publications

  • Ranjeeth Kumar and C.V. Jawahar - Kernel Approach to Autoregressive Modeling, Proc. of the Thirteenth National Conference on Communications (NCC 2007), Kanpur, 2007. [PDF]

  • Ranjeeth Kumar and C.V. Jawahar - Class-Specific Kernel Selection for Verification Problems, Proc. of the Sixth International Conference on Advances in Pattern Recognition (ICAPR 2007), Kolkata, 2007. [PDF]

  • S. Manikandan, Ranjeeth Kumar and C.V. Jawahar - Tensorial Factorization Methods for Manipulation of Face Videos, The 3rd International Conference on Visual Information Engineering, 26-28 September 2006, Bangalore, India. [PDF]

  • Ranjeeth Kumar, S. Manikandan and C. V. Jawahar - Task Specific Factors for Video Characterization, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 376-387, 2006. [PDF]

  • Ranjeeth Kumar, S. Manikandan and C. V. Jawahar - Face Video Alteration Using Tensorial Methods, in Pattern Recognition, Journal of Pattern Recognition Society (Submitted)

Downloads

thesis

 ppt

 
