Thesis Students

Layer Extraction, Removal and Completion of Indoor Videos: A Tracking Based Approach

Vardhman Jain (homepage)

Image segmentation and layer extraction in video refer to the process of segmenting an image or video frames into their constituent objects. Fully automatic techniques are not always suitable, as the objective is often difficult to describe. With the advent of interactive techniques, these algorithms can now be used to select an object of interest in an image or video precisely and with less effort. Object segmentation opens up further possibilities, such as cutting and pasting objects from one image or video to another. Object removal in images and videos is another application of interest. As the name suggests, the task is to eliminate an object from the image or video, which involves recovering the background information previously occluded by the object. Object removal in both images and videos has found interesting applications, especially in the entertainment industry. The concept of filling in information from the surrounding region (for images) or the surrounding frames (for videos) has also been applied to recovering damaged images or clips.

This thesis presents two new approaches. The first is for object segmentation, or layer extraction, from a video. This method allows segmenting complex objects in videos, which can have difficult motion models. The algorithm integrates a robust point tracking algorithm into a 3D graph cuts formulation. Tracking is used to propagate the user-given seeds in key frames to the intermediate frames, which helps provide a better initialization for the graph cuts optimization. The second is an approach for video completion in indoor scenes. We propose a novel method for video completion using multiview information without applying a full-frame or complete motion segmentation. The heart of the algorithm is a method to partition the scene into regions supporting multiple homographies based on a geometric formulation, thereby providing precise segmentation even at points where the actual scene information is missing due to the removal of the object. We demonstrate our algorithms on a number of representative videos. We also present a few directions for future work that extend the work presented here.

Year of completion:	2006
Advisor :	P. J. Narayanan

Related Publications

Vardhman Jain and P.J. Narayanan - Layer Extraction Using Graph Cuts and Feature Tracking, The 3rd International Conference on Visual Information Engineering 26-28 September 2006 in Bangalore, India. [PDF]
Vardhman Jain and P. J. Narayanan - Video Completion for Indoor Scenes, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338 pp.409-420, 2006. [PDF]

Downloads

ppt

DGTk: A Data Generation Tool for CV and IBR

V. Vamsi Krishna

Computer Vision (CV) and Image Based Rendering (IBR) are fields that have emerged from the search for a means to make computers understand images as humans do, and from the never-ending pursuit of photo-realistic rendering by the Computer Graphics community. Though these fields deal with completely different problems, both CV and IBR algorithms require high quality ground-truth information about the scenes they are applied to. Traditionally, research groups have spent large amounts of resources on creating data using high-resolution equipment for qualitative analysis of CV and IBR algorithms. Such high quality data provided a platform for comparison of CV and IBR algorithms. Though these datasets have enabled comparison of algorithms, during the past decade the development in the fields of CV and IBR has outpaced the ability of such standard datasets to differentiate among the best performing algorithms. The resources invested in generating these datasets are then wasted. To overcome this problem, researchers have resorted to creating synthetic datasets by extending existing 3D authoring tools, developing stand-alone tools for generating synthetic data, and developing novel methods of data acquisition for acquiring high quality real world data. The disadvantages of acquiring data using high resolution equipment include (1) the time required for setting up the configuration of equipment, (2) errors in measuring devices due to physical limitations, and (3) limited repeatability of experiments due to uncontrollable parameters like wind, fog and rain. Synthetic data is preferred for the early testing of algorithms, since it makes qualitative and quantitative analysis possible. The performance of an algorithm on synthetic data generally provides a good indication of its performance on real world data. (more...)

Year of completion:	2006
Advisor :	P. J. Narayanan

Related Publications

Vamsikrishna and P.J. Narayanan - Data Generation Toolkit for Image Based Rendering Algorithms, The 3rd International Conference on Visual Information Engineering 26-28 September 2006 in Bangalore, India. [PDF]

Downloads

ppt

Robust Registration For Video Mosaicing Using Camera Motion Properties

Pulkit Parikh (Home Page)

In recent years, video mosaicing has emerged as an important problem in the domain of computer vision and computer graphics. Video mosaics find applications in many popular arenas including video compression, virtual environments and panoramic photography. Mosaicing is the process of generating a single, large, integrated image by combining the visual cues from multiple images. The composite image, called a mosaic, provides a large field of view without compromising the image resolution. When the input images are the frames of a video, the process is called video mosaicing. The process of mosaicing primarily consists of two key components: image registration and image compositing. The focus of this work is on image registration - the process of estimating a transformation that relates the frames of the video. In addition to mosaicing, registration has a wide range of other applications in ortho-rectification, scene transfer, video in-painting, etc. We employ a homography, the general 2D projective transformation, for image registration. Typically, a homography is estimated from a set of matched feature points extracted from the frames of the input video.
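As a minimal sketch of how a homography can be estimated from matched feature points, the following uses the standard direct linear transform (DLT) with NumPy. It is an illustration only, not the thesis's implementation: the function names `estimate_homography` and `apply_homography` are hypothetical, the matches are assumed to be outlier-free, and coordinate normalization (usually recommended for numerical stability) is omitted for brevity.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 H with dst ~ H @ src from at least four point matches (DLT)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least four matched 2D points")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each match contributes two linear constraints on the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    H = vt[-1].reshape(3, 3)
    # Fix the scale; assumes H[2, 2] is not (near) zero.
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map 2D points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Usage: recover a known transformation from five synthetic matches.
H_demo = np.array([[1.2, 0.1, 5.0], [-0.05, 0.9, -3.0], [0.001, 0.002, 1.0]])
src_demo = [(0, 0), (100, 0), (100, 80), (0, 80), (50, 30)]
dst_demo = apply_homography(H_demo, src_demo)
H_recovered = estimate_homography(src_demo, dst_demo)
```

With real video frames the matches contain mismatches, so an estimator of this kind is normally wrapped in a robust scheme such as RANSAC; that step is left out of this sketch.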

Video mosaicing has traditionally been viewed as a problem of registering (and stitching) only successive video frames. While some of the recent global alignment approaches make use of the information stemming from non-consecutive pairs of video frames, the registration of each frame pair is typically done independently. No real emphasis has been laid on how the imaging process and the camera motion relate the pair-wise homographies. Therefore, accurate registration and, in turn, mosaicing remain challenging tasks, especially in the face of poor feature point correspondence. For example, mosaicing a desert video wherein the frames have minimal texture to provide feature points (e.g. corners), or mosaicing in the presence of repetitive texture that leads to several mismatched points, is highly error-prone.

It is known that the camera that captures the video frames to be mosaiced does not undergo arbitrary motion. The trajectory of the camera is almost always smooth. Some examples where such motion is seen are aerial videos taken by a camera mounted on an aircraft, videos captured by robots, automated vehicles, etc. We propose an approach which exploits this smoothness constraint to refine outlier homographies, i.e., homographies that are detected to be erroneous. In many scenarios, especially in machine-controlled environments, the camera motion, apart from being continuous, also follows a model (say, a linear motion model), giving us much more concrete and precise information. In this thesis, we derive relationships between homographies in a video sequence, under many practical scenarios, for various camera motion models. In other words, we translate the camera motion model into a global homography model whose parameters can characterize homographies between every pair of frames. We present a generic methodology which uses this global homography model for accurate registration.

The above-mentioned derivations and algorithms have been implemented, verified and tested on various datasets. The analysis and comparative results demonstrate significant improvement over the existing approaches in terms of accuracy and robustness. Superior quality mosaics have been developed using our algorithms in the presence of many commonly observed issues like texture-less frames and frames containing repetitive texture, with applicability to indoor (e.g., mosaicing a map) as well as outdoor (e.g., representing an aerial video as a mosaic) settings. For quantitative evaluation of mosaics, we have devised a novel, computationally efficient quality measure. The quantitative results are completely in tune with the visual perception, further testifying to the superiority of our approach. In a nutshell, the premise that the properties of the camera motion can serve as an important additional cue for improved video mosaicing has been empirically validated.

Year of completion:	2007
Advisor :	C. V. Jawahar

Related Publications

Pulkit Parikh and C.V. Jawahar - Enhanced Video Mosaicing using Camera Motion Properties, Proc. of IEEE Workshop on Motion and Video Computing (WMVC 2007), Austin, Texas, USA, 2007. [PDF]
Pulkit Parikh and C.V. Jawahar - Motion constraints For Video Mosaicing, The 3rd International Conference on Visual Information Engineering 26-28 September 2006 in Bangalore, India. [PDF]

Downloads

ppt

Enhancing Weak Biometric Authentication by Adaptation and Improved User-Discrimination

Vandana Roy

Biometric technologies are becoming the foundation of an extensive array of person identification and verification solutions. Biometrics is defined as the science of recognising a person based on certain physiological (fingerprints, face, hand-geometry) or behavioral (voice, gait, keystrokes) characteristics. Weak biometrics (hand-geometry, face, voice) are traits which possess low discriminating content and which change over time for each individual. Thus they show lower performance than the strong biometrics (e.g. fingerprints, iris, retina, etc.). Due to exponentially decreasing costs of hardware and computation, biometrics has found immense use in civilian applications (Time and Attendance Monitoring, Physical Access to Building, Human-Computer Interface, etc.) in addition to forensic ones (e.g. criminal and terrorist identification). Various factors come into the picture while selecting biometric traits for civilian applications, the most important of which are user psychology and acceptability. Most of the weak biometric traits have little or no association with criminal history, unlike fingerprints (a strong biometric); data acquisition is also very simple and easy with weak biometrics. For these reasons, weak biometric traits are often better accepted for civilian applications than the strong biometric traits. Moreover, not much research has gone into this area as compared to strong biometrics.

Due to the low discriminating content of the weak biometric traits, they result in poor verification performance. We propose a feature selection technique called Single Class Hierarchical Discriminant Analysis (SCHDA) specifically for authentication in biometric systems. SCHDA recursively identifies the samples which overlap with the samples of the claimed identity in the discriminant space built by the single-class discriminant criterion. If samples of the claimed identity are termed "positive" samples, and all the other samples "negative" samples, the single-class discriminant criterion finds an optimal transformation such that the ratio of the negative scatter with respect to the positive mean over the positive within-class scatter is maximized, thereby pulling together the positive samples and pushing the negative samples away from the positive mean. Thus SCHDA builds an optimal user-specific discriminant space for each individual, in which the samples of the claimed identity are well-separated from the samples of all the other users. The authentication performance of this technique is compared with that of other popular discriminant analysis techniques in the literature, and significant improvement has been observed.
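The single-class discriminant criterion described above can be sketched as a generalized eigenvalue problem. The code below is a minimal illustration with NumPy, not the SCHDA algorithm itself: it computes only one level of the discriminant space and omits the recursive (hierarchical) step. The function name `single_class_discriminant` and the small regularization term `reg` are assumptions introduced for this example.

```python
import numpy as np

def single_class_discriminant(pos, neg, n_components=1, reg=1e-6):
    """Find directions maximizing negative scatter about the positive mean
    relative to the positive within-class scatter."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    mean_pos = pos.mean(axis=0)
    dev_pos = pos - mean_pos
    dev_neg = neg - mean_pos          # negatives are measured from the POSITIVE mean
    # Regularize so the positive scatter is invertible (assumed, not from the source).
    s_pos = dev_pos.T @ dev_pos + reg * np.eye(pos.shape[1])
    s_neg = dev_neg.T @ dev_neg
    # Whiten with the Cholesky factor of s_pos to obtain a symmetric eigenproblem.
    chol = np.linalg.cholesky(s_pos)
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ s_neg @ chol_inv.T
    eigvals, eigvecs = np.linalg.eigh(whitened)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Map the eigenvectors back to the original feature space.
    projection = chol_inv.T @ eigvecs[:, order]
    return projection, mean_pos

# Usage: positives vary along x, negatives lie far away along y.
pos_demo = [[-1, 0], [1, 0], [0, 0.1], [0, -0.1]]
neg_demo = [[0, 3], [1, 3], [-1, 3.2], [0.5, -3]]
W_demo, mean_demo = single_class_discriminant(pos_demo, neg_demo)
```

In this toy example the leading direction is dominated by the y component, since projecting onto it keeps the positive samples tightly clustered while moving the negative samples away from the positive mean.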

The second problem which leads to low authentication accuracy is the poor stability or permanence of weak biometric traits due to various reasons (e.g. ageing, the person gaining or losing weight, etc.). Civilian applications usually operate in a cooperative or monitored mode wherein the users can give feedback to the system when errors occur. An intelligent adaptive framework is proposed which uses this feedback to incrementally update the parameters of the feature selection and verification framework for each individual. This technique does not require the system to be re-trained to address the issue of changing features.

The third factor which has been explored to improve the performance of authentication for civilian applications is the pattern of participation of the enrolled users. As new users are enrolled into the system, performance degrades due to the increasing number of users. Traditionally, the system must be re-trained periodically with the existing users to take care of this issue. An interesting observation is that although the number of users enrolled into the system can be very high, the number of users who regularly participate in the authentication process is comparatively low. Thus, modeling the variation in the participating population helps to bypass the re-training process. We propose to model this variation using Markov models. Using these models, the prior probability of participation of each individual is computed and incorporated into the traditional feature selection framework, giving more relevance to the parameters of regularly participating users. Both structured and unstructured modes of variation of participation were explored. Experiments were conducted on varied datasets, verifying our claim that incorporating the prior probability of participation helps to improve the performance of a biometric system over time.
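One simple way to picture a participation prior is a per-user two-state Markov chain (absent = 0, present = 1) fitted to that user's attendance history. The sketch below is an assumption made for illustration, since the abstract does not specify the form of the Markov models used; the function names and the Laplace `smoothing` parameter are hypothetical.

```python
def fit_participation_chain(history, smoothing=1.0):
    """Estimate a 2x2 transition matrix from a 0/1 participation history.

    transition[a][b] is the estimated probability of moving from state a to b.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    # Laplace smoothing keeps unseen transitions from getting zero probability.
    counts = [[smoothing, smoothing], [smoothing, smoothing]]
    for prev, cur in zip(history, history[1:]):
        counts[prev][cur] += 1
    return [[c / sum(row) for c in row] for row in counts]

def participation_prior(history, smoothing=1.0):
    """Probability that the user participates in the next session,
    given the most recent observed state."""
    transition = fit_participation_chain(history, smoothing)
    return transition[history[-1]][1]

# Usage: a regular participant versus an occasional one.
regular_user = [1, 1, 1, 0, 1, 1, 1, 1]
occasional_user = [0, 0, 0, 0, 1, 0, 0, 0]
prior_regular = participation_prior(regular_user)
prior_occasional = participation_prior(occasional_user)
```

A prior of this kind could then be used to weight each user's contribution when the feature selection parameters are computed, so that regularly participating users count for more.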

In order to validate our claims and techniques, we used hand-geometry and keystroke-based biometric traits. The hand images were acquired using a simple low-cost setup consisting of a digital camera and a flat translucent platform with five rigid pegs (to ensure that the acquired images are well-aligned). The platform is illuminated from beneath so as to simplify the preprocessing of the acquired images. The features used for hand-geometry include the lengths of four fingers and the widths at five equidistant points on each finger. Features of the thumb are not used, as these measurements show high variability for the same user. This dataset was used to validate the proposed feature selection technique. For keystroke-based biometrics, the features used were the dwell time (duration of a key-press event) and flight time (duration between a key-release event and the next key-press event) of each key, and the number of times the backspace and delete keys were pressed. Data was collected from subjects who were not accustomed to a particular kind of keyboard (French keyboard). The features extracted from this dataset were time-varying and were used to validate the concept of incremental updating.
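The keystroke features described above can be computed directly from timestamped key events. The following is a minimal sketch; the event format (key name, press time, release time, in milliseconds) and the function name `keystroke_features` are assumptions made for this example, not details taken from the thesis.

```python
def keystroke_features(events):
    """Compute dwell times, flight times and the correction-key count.

    events: list of (key, press_time, release_time) tuples in typing order.
    """
    # Dwell time: how long each key is held down.
    dwell = [release - press for _, press, release in events]
    # Flight time: gap between releasing one key and pressing the next.
    flight = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    # Number of times backspace or delete was pressed.
    corrections = sum(1 for key, _, _ in events if key in ("Backspace", "Delete"))
    return dwell, flight, corrections

# Usage with four hypothetical key events (times in milliseconds).
sample_events = [
    ("a", 0, 80),
    ("b", 120, 190),
    ("Backspace", 260, 330),
    ("c", 400, 470),
]
dwell_times, flight_times, correction_count = keystroke_features(sample_events)
```

Note that a flight time can be negative when the next key is pressed before the previous one is released, which happens with fast typists.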

In this thesis, we identify and address some of the issues which lead to low performance of authentication using certain weak biometric traits. We also look into the problem of low performance of authentication in large-scale biometrics for civilian applications.

Year of completion:	2007
Advisor :	C. V. Jawahar

Related Publications

Vandana Roy and C. V. Jawahar - Modeling Time-Varying Population for Biometric Authentication, International Conference on Computing: Theory and Applications (ICCTA), Kolkata, 2007. [PDF]
Vandana Roy and C. V. Jawahar - Hand-Geometry Based Person Authentication Using Incremental Biased Discriminant Analysis, Proceedings of the National Conference on Communication (NCC 2006), Delhi, January 2006, pp 261-265. [PDF]
Vandana Roy and C. V. Jawahar - Feature Selection for Hand-Geometry based Person Authentication, Proceedings of the Thirteenth International Conference on Advanced Computing and Communications, Coimbatore, December 2005. [PDF]

Downloads

ppt

Vision based Robot Navigation using an On-line Visual Experience

D. Santosh Kumar (homepage)

Vision-based robot navigation has long been a fundamental goal in both robotics and computer vision research. While the problem is largely solved for robots equipped with active range-finding devices, for a variety of reasons the task remains challenging for robots equipped only with vision sensors. Vision is an attractive sensor as it helps in the design of economically viable systems with simpler sensor limitations. It facilitates passive sensing of the environment and provides valuable semantic information about the scene that is unavailable to other sensors. Two popular paradigms have emerged to analyze this problem, namely model-based and model-free algorithms. Model-based approaches demand that a priori model information be made available in advance. In the latter, the required 3D information is computed online. Model-free navigation paradigms have gained popularity over model-based approaches due to their simpler assumptions and wider applicability. This thesis discusses a new paradigm for vision-based navigation, namely image-based navigation. The basic observation is that model-free paradigms involve an intermediate depth computation which is redundant for the purpose of navigation. Rather, the motion instruction required to control the robot can be inferred directly from the acquired images. This approach is more attractive as the modeling of objects is now simply substituted by the memorization of views, which is far easier than 3D modeling.

In this thesis, a new image-based navigation architecture is developed, which facilitates online learning about the world by a robot. The framework enables a robot to autonomously explore and navigate a variety of unknown environments, in a way that facilitates path planning and goal-oriented tasks, using visual maps that are contextually built in the process. It also facilitates the incorporation of feedback received from performing specific goal-oriented tasks to update the visual representation. Based on this architecture, the design of the individual algorithms required for performing the navigation task (exploration, servoing and learning) is discussed. (more...)

Year of completion:	June 2007
Advisor :	C. V. Jawahar

Related Publications

D. Santosh and C.V. Jawahar - Visual Servoing in Non-Rigid Environment: A Space-Time Approach, Proc. of IEEE International Conference on Robotics and Automation (ICRA'07), Roma, Italy, 2007. [PDF]
D. Santosh Kumar and C.V. Jawahar - Visual Servoing in Presence of Non-Rigid Motion, Proc. 18th IEEE International Conference on Pattern Recognition (ICPR'06), Hong Kong, Aug 2006. [PDF]
D. Santosh Kumar and C.V. Jawahar - Robust Homography-Based Control for Camera Positioning in Piecewise Planar Environment, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338 pp.906-918, 2006. [PDF]

D. Santosh and C.V. Jawahar - Cooperative CONDENSATION-based Recognition, in 8th Asian Conference on Computer Vision (ACCV) (Under Review), 2007.
D. Santosh and C.V. Jawahar - Visual Servoing in Non-Rigid Environments, in IEEE Transactions on Robotics (ITRO) (Under Submission), 2008.
D. Santosh and C.V. Jawahar - Robot Path Planning by Reinforcement Learning along with Potential Fields, in 25th IEEE International Conference on Robotics and Automation (ICRA) (Under Submission), 2008.
D. Santosh, A. Supreeth and C.V. Jawahar - Mobile Robot Exploration and Navigation using a Single Camera, in 25th IEEE International Conference on Robotics and Automation (ICRA) (Under Submission), 2008.

Downloads

ppt
