
High Quality Image Reconstruction by Optimal Capture and Accurate Registration


Himanshu Arora (homepage)

Generating high-resolution, high-quality images from degraded images has a variety of applications in space imaging, medical imaging, commercial videography, and recognition algorithms. Camera-equipped cell phones are now widely used for day-to-day photography. Limitations of the image-capturing hardware and environmental attenuations introduce degradations into the captured image. The quality of the image reconstructed by most algorithms is highly sensitive to the accurate computation of underlying parameters such as blur, noise, and geometric deformation. Variations in blur, illumination, and noise, together with occlusions, make the computation of these parameters even more difficult.

One way of generating a high-quality image is by fusing multiple images that are displaced at sub-pixel levels. This method is popularly known as multi-frame super-resolution (SR). Most multi-image SR algorithms assume that the exact registration and blur parameters between the constituent frames are known. Traditional approaches to image registration are either sensitive to image degradations such as variations in blur, illumination, and noise, or are limited in the class of image transformations they can estimate. These conditions are frequently violated in real-world imaging, where specular surfaces, close light sources, and small sensors and lenses create large variations in illumination, noise, and blur within the scene. Interestingly, these are exactly the situations where one would like to employ SR algorithms.

We explore an alternate solution to the problem of robustness in the registration step of an SR algorithm. We present an accurate registration algorithm that uses local phase information, which is robust to the above degradations. The registration step is formulated as an optimization of the local phase alignment at various spatial frequencies. We derive the theoretical error rate of the estimates in the presence of non-ideal band-pass behavior of the filter and show that the error converges to zero over iterations. We also show the invariance of local phase to a class of blur kernels. We demonstrate experimental results on images taken under varying conditions.
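To make the idea concrete, here is a minimal sketch of phase-based shift estimation: the local phase is computed with a complex Gabor filter, and for a pure translation the phase difference between two images is proportional to the shift. This is an illustration of the principle, not the thesis implementation; the filter frequency and envelope width are illustrative choices.

```python
import numpy as np
from scipy.signal import fftconvolve

def local_phase(img, freq=0.15, sigma=3.0):
    """Local phase from a 1-D complex (quadrature) Gabor filter applied
    along image rows. `freq` (cycles/pixel) and `sigma` (envelope width)
    are illustrative parameters."""
    n = int(3 * sigma)
    x = np.arange(-n, n + 1)
    gabor = np.exp(-x**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * x)
    resp = fftconvolve(img.astype(float), gabor[np.newaxis, :], mode="same")
    return np.angle(resp)

def estimate_shift(img_a, img_b, freq=0.15):
    """Horizontal sub-pixel shift of img_b relative to img_a: if img_b is
    img_a shifted right by d pixels, then phase_b - phase_a ~ -2*pi*freq*d.
    The wrapped phase difference is pooled with a median for robustness."""
    dphi = np.angle(np.exp(1j * (local_phase(img_b, freq) -
                                 local_phase(img_a, freq))))
    return -np.median(dphi) / (2 * np.pi * freq)
```

Because phase is insensitive to smooth intensity changes and to a class of symmetric blur kernels, an estimate pooled over many pixels and several filter frequencies stays stable where intensity-based matching fails.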

Recently, Lin and Shum have shown an upper limit on multi-frame SR techniques; for practical purposes this limit is very small. Another class of SR algorithms formulates high-quality image generation as an inference problem, where the high-resolution image is inferred from a set of learned training patches. Such algorithms work well for natural structures, but for many man-made structures they do not produce accurate results. We propose to capture images at the optimal zoom from the perspective of image super-resolution. Images captured at this zoom carry sufficient information to be magnified further by any SR algorithm that promotes step edges and certain other features. This has a variety of applications in consumer cameras, large-scale automated image mosaicing, robotics, and improving the recognition accuracy of many computer vision algorithms. Existing efforts are limited to imaging a pre-determined object at the right zoom. In the proposed approach, we learn the patch structures at various down-sampling factors. To further enhance the output, we impose the local context around the patch in an MRF framework. Several constraints are introduced to minimize the extent of zoom-in.

Projector-camera systems are used for various applications in computer vision, immersive environments, visual servoing, etc. Due to gaps between neighboring pixels on the projector's image plane and variations in scene depth, the image projected onto a scene shows pixelation and blurring artifacts. Certain vision and graphics applications require a high-quality composite of the scene and the projected image, excluding these artifacts while retaining the scene characteristics. The localization and restoration of projected pixels is difficult due to factors such as spatially varying blur, background texture, noise, the shapes of scene objects, and the color transformations of the projector and camera. We extend the use of local phase, computed using the Gabor filter, to isolate each of the projected pixels distinctively; the local phase is also invariant to a class of blur kernels. To restore the captured images, we reproject the projected pixels such that these artifacts are absent. To improve the quality further, we propose a mechanism to virtualize a high-resolution projector.

 

Year of completion:  2009
 Advisor : Anoop M. Namboodiri

Related Publications

  • Himanshu Arora and Anoop M. Namboodiri - How much zoom is the right zoom from the perspective of Super-Resolution? IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 142-149, 16-19 December 2008, Bhubaneswar, India. [PDF]

  • Himanshu Arora and Anoop M. Namboodiri - Projected Pixel Localization and Artifact Removal in Captured Images, IEEE TENCON, 18-21 November 2008, Hyderabad, India. [PDF]

  • Himanshu Arora, Anoop M. Namboodiri and C. V. Jawahar - Robust Image Registration with Illumination, Blur and Noise Variations for Super-Resolution, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), 30 March - 4 April 2008, Las Vegas, USA. [PDF]

  • Himanshu Arora, Anoop M. Namboodiri and C. V. Jawahar - Accurate Image Registration from Local Phase Information, Proceedings of the Thirteenth National Conference on Communications (NCC 2007), Kanpur, 2007. [PDF]



Towards Large Scale Image Based Robot Navigation


Supreeth Achar (homepage)

Automatic mapping of environments and autonomous navigation are two of the central problems in mobile robotics. A number of different sensor modalities can be used to solve them. Vision sensors are relatively low in cost and can provide very rich information about the environment; however, active range-finding sensors tend to be more widely used because interpreting images has been a very challenging, computationally intensive task. Recent advances in computer vision algorithms and improvements in hardware capabilities have greatly increased the scope for using vision in robotics, even in unstructured, real-world environments. Approaches to applying computer vision to robotics problems fall into two classes: model-based and image-based. Model-based approaches use some sort of metric representation of the environment as a map. This map can be built manually and provided to the robot, or it can be built automatically using visual SLAM methods; building accurate reconstructions becomes increasingly difficult as the size of the environment grows. Image-based methods for robot navigation avoid the need for metric modeling. Instead of inferring a 3D model of the environment and using it to determine where the robot is and what it should do, image-based methods work directly over images. Images captured from various positions in the environment are stored as nodes in a topological graph whose edges encode the connectivity between pairs of poses.

Localization is the process of determining the position of a robot from sensor readings. In an image-based framework, localization entails finding a previously seen image stored in the topological graph that closely matches the current view. If such an image exists, it can be inferred that the robot is close to the pose where the stored image was captured. In this thesis, a robot localization method is developed that builds upon the bag-of-words techniques used in content-based image retrieval. Dynamic scene elements such as pedestrians and traffic, which could corrupt localization results, are handled with the help of a motion detection algorithm. Prior information regarding robot pose is used in conjunction with visual word frequency distributions to assign weights that give higher importance to words that are more informative about the robot's pose. The method was validated in a localization experiment over a six-kilometer path through city roads.
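The retrieval step can be sketched as follows: each stored image is a histogram of visual-word counts, words are reweighted by inverse document frequency so that common words count less, and the best match is found by cosine similarity, optionally modulated by a pose prior. This is a generic bag-of-words retrieval sketch, not the thesis's exact weighting scheme; `pose_prior` is a hypothetical per-node belief.

```python
import numpy as np

def idf_weights(db_hists):
    """Inverse-document-frequency weights from a database of visual-word
    histograms (one row per stored image in the topological graph)."""
    df = np.count_nonzero(db_hists > 0, axis=0)   # document frequency
    return np.log(len(db_hists) / np.maximum(df, 1))

def localize(query_hist, db_hists, pose_prior=None):
    """Index of the stored image most similar to the current view.
    `pose_prior` (hypothetical) is a per-node weight in [0, 1] encoding
    prior belief about where the robot is."""
    idf = idf_weights(db_hists)
    q, db = query_hist * idf, db_hists * idf
    sims = db @ q / (np.linalg.norm(db, axis=1) * np.linalg.norm(q) + 1e-9)
    if pose_prior is not None:
        sims *= pose_prior
    return int(np.argmax(sims))
```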

Image-based methods are ideally suited to learning the rough structure and connectivity of an environment, especially the detection of loop closures. Loop closures can be found using the same image-retrieval techniques used for localization. Loop closure detection is a major problem in SLAM because data at the end of a loop must be associated with data from the beginning, after which the entire map built by the robot needs to be updated. Chapter 3 presents a method for inferring a metric map of an environment from an appearance-based topological graph representation. Building a topological graph as a preliminary step makes information about the connectivity of the environment available to the SLAM algorithm, which makes the mapping process significantly easier. The offline SLAM process can be visualized as an optimization over a graph. Each robot pose along the trajectory, and each feature in the map, is represented by a node. Each edge links two nodes connected by a measurement: either a measurement of a feature position with respect to some point on the robot trajectory, or an estimate of the motion between two points along the trajectory. Due to noise and modeling errors, these measurements will be inconsistent with each other. The optimization finds a consistent trajectory and map that maximizes the log-likelihood of the measurements under assumptions of Gaussian sensor and motion-model noise.
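Under the Gaussian-noise assumption, maximizing the log-likelihood reduces to weighted least squares over the graph. The 1-D toy below (illustrative numbers, not thesis data) shows the essential structure: odometry edges plus one loop-closure edge, a single linear solve, and a gauge constraint fixing the first pose.

```python
import numpy as np

# Toy 1-D pose graph: 4 poses along a corridor. Each edge is
# (i, j, measured x_j - x_i, information weight). The last edge is a
# loop closure relating pose 3 directly back to pose 0.
edges = [(0, 1, 1.0, 1.0), (1, 2, 1.1, 1.0),
         (2, 3, 0.9, 1.0), (0, 3, 2.7, 2.0)]
n = 4

# Gaussian noise => maximum likelihood == weighted linear least squares.
A, b = [], []
for i, j, z, w in edges:
    row = np.zeros(n)
    row[i], row[j] = -1.0, 1.0
    A.append(np.sqrt(w) * row)
    b.append(np.sqrt(w) * z)
A, b = np.array(A), np.array(b)
A[:, 0] = 0.0                       # fix pose 0 at the origin (gauge)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(x, 3))               # poses consistent with all edges
```

In 2-D or 3-D the residuals become non-linear in the pose parameters and the same structure is solved iteratively (Gauss-Newton style), but each iteration is exactly this kind of sparse linear problem.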

In natural outdoor environments, the deformable nature of objects can make it difficult to establish correspondences between points in two images of the same scene taken from different views. Most existing closed-loop methods for controlling the position of a manipulator or robot using visual servoing require accurate feature correspondences. Chapter 4 presents a visual servoing algorithm that does away with the need for feature correspondence between the current and target views by modeling the distribution of features in an image as a mixture of Gaussians. The L2 distance between two such Gaussian mixtures can be expressed in closed form. A control law is derived which minimizes this distance function between the two Gaussian mixtures, causing the robot to move from its current pose to the target pose. Because the control law is defined in terms of a statistic over the feature distribution rather than the positions of individual features, the control is robust to measurement errors.
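The closed form comes from the Gaussian product identity: the integral of N(x; m1, S1) * N(x; m2, S2) over x equals N(m1; m2, S1 + S2), so the squared L2 distance between two mixtures expands into three double sums of Gaussian evaluations. A minimal sketch of just the distance (the control law mapping it to camera velocities is not shown):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_l2_sq(w1, mu1, cov1, w2, mu2, cov2):
    """Squared L2 distance between two Gaussian mixtures, in closed form.
    Each mixture is (weights, list of means, list of covariances)."""
    def cross(wa, ma, ca, wb, mb, cb):
        # sum over all component pairs of the Gaussian overlap integral
        return sum(wi * wj * multivariate_normal.pdf(mi, mean=mj, cov=ci + cj)
                   for wi, mi, ci in zip(wa, ma, ca)
                   for wj, mj, cj in zip(wb, mb, cb))
    return (cross(w1, mu1, cov1, w1, mu1, cov1)
            - 2 * cross(w1, mu1, cov1, w2, mu2, cov2)
            + cross(w2, mu2, cov2, w2, mu2, cov2))

# Identical mixtures have distance 0 (up to floating-point error):
w, mu, cov = [0.5, 0.5], [np.zeros(2), np.ones(2)], [np.eye(2)] * 2
print(gmm_l2_sq(w, mu, cov, w, mu, cov))
```

No correspondence between components is needed: the double sums compare every component against every other, which is what lets the servoing loop ignore which feature is which.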

 

Year of completion:  August 2009
 Advisor : C. V. Jawahar

Related Publications

  • Supreeth Achar and C. V. Jawahar - Adaptation and Learning for Image Based Navigation, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 103-110, 16-19 December 2008, Bhubaneswar, India. [PDF]

  • D. Santosh, Supreeth Achar and C. V. Jawahar - Autonomous Image-based Exploration for Mobile Robot Navigation, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2008), 19-23 May 2008, Pasadena, California. [PDF]

  • A. H. Abdul Hafez, Supreeth Achar and C. V. Jawahar - Visual Servoing based on Gaussian Mixture Models, Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2008), 19-23 May 2008, Pasadena, California. [PDF]



Exploring Irregular Memory Access Applications on the GPU


Mohammed Suhail Rehman (homepage)

General-purpose computation on graphics processors (GPGPU) is a recent development in high-performance computing. Graphics hardware has evolved over the past decade from fixed-function stream processors to highly programmable computation engines that offer the best price-to-performance ratio among current parallel hardware. The GeForce GTX 280 from NVIDIA delivers a peak performance of 1 TFLOP and costs about US$400. This peak performance, however, is hard to realize for many applications. Applications that generally see good performance gains (compared to implementations on similar parallel hardware or the CPU) are those that are extremely parallel and have data locality that matches the memory subsystem of such hardware well. A lot of research has been devoted to finding ways to optimally utilize this massively parallel hardware to accelerate a diverse range of applications.

There exists a range of applications which are parallelizable but do not possess 'good' memory access characteristics. These are generally classified as irregular memory access applications, and developing techniques to implement them on massively parallel hardware has been a challenge. Such applications are traditionally not well suited to most modern parallel architectures, so even modest performance gains are appreciated. Developing suitable primitives for this class of applications is crucial to advancing the acceptance of massively parallel hardware in the high-performance computing community.

We introduce a technique to solve one such problem, list ranking, efficiently on the GPU for large data sets. The technique is a recursive formulation of a recent algorithm for shared-memory systems by Helman and JáJá. The recursive formulation allowed us to exploit the massive parallelism of the GPU by minimizing the number of idle threads during computation. We also discuss the optimal parameters for running this algorithm on the GPU, leading to the fastest single-chip implementation of list ranking at the time of writing. Our optimized algorithm can rank a random list of 8 million elements in about 50 milliseconds, which translates to about a 15x speedup over a sequential CPU implementation and a 5-6x speedup over the Cell Broadband Engine.
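List ranking computes, for every node of a linked list, its distance from the end; it is "irregular" because each step chases pointers scattered across memory. The sketch below uses Wyllie-style pointer jumping, the simplest data-parallel formulation, rather than the work-efficient Helman-JáJá scheme the thesis builds on (numpy stands in for the GPU's data-parallel step).

```python
import numpy as np

def list_rank(succ):
    """Pointer-jumping list ranking (Wyllie's algorithm).
    `succ[i]` is i's successor, with succ[tail] == tail.
    Returns rank[i] = number of hops from i to the tail."""
    n = len(succ)
    rank = (succ != np.arange(n)).astype(np.int64)   # tail starts at 0
    nxt = succ.copy()
    for _ in range(int(np.ceil(np.log2(max(n, 2))))):
        rank = rank + rank[nxt]     # add the rank of the node we skip to
        nxt = nxt[nxt]              # pointer jumping: double the hop length
    return rank

# A 4-node list 0 -> 1 -> 2 -> 3 (3 is the tail): ranks are [3, 2, 1, 0].
print(list_rank(np.array([1, 2, 3, 3])))
```

Pointer jumping does O(n log n) total work; the Helman-JáJá approach splits the list into sublists to get closer to O(n) work, which is why it wins for large inputs.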

We then apply the list ranking primitive to accelerate several Euler-tour-technique-based algorithms, including tree rooting (parent finding), subtree size calculation, tree traversals, and vertex level computation. A major chunk of the runtime of these implementations is the list ranking step, and as a result we obtain highly efficient algorithms for processing trees on the GPU. We see a sustained performance benefit of about 9-10x over a sequential BFS implementation and about 2x over BFS implemented in CUDA. Optimizing such applications for the GPU is not a trivial task. We explain some of the characteristics of irregular application development for the GPU, as well as techniques to model and assess GPU performance. Our research has yielded a set of simple equations that can aid in quickly estimating the runtime of a particular implementation on the GPU. We discuss list ranking in this context and provide a few pointers on maximizing performance for irregular applications.
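As an illustration of the Euler-tour technique, the sketch below roots a tree using the `list_rank` function from the previous sketch: every undirected edge becomes two directed arcs, the arcs are chained into a single tour by walking around each vertex's adjacency list, and after ranking the tour, the arc of each edge that appears earlier points away from the root. This is a serial Python rendering of the idea under those assumptions, not the GPU implementation.

```python
import numpy as np

def root_tree(adj, root=0):
    """Find parent[] of every vertex via an Euler tour + list ranking.
    `adj[v]` is the list of v's neighbours in an undirected tree."""
    n = len(adj)
    first = np.cumsum([0] + [len(a) for a in adj])  # arc id base per vertex
    pos = {(v, u): k for v in range(n) for k, u in enumerate(adj[v])}
    succ = np.empty(first[-1], dtype=np.int64)
    for u in range(n):
        for k, v in enumerate(adj[u]):
            # successor of arc u->v is the next arc around v after v->u
            succ[first[u] + k] = first[v] + (pos[(v, u)] + 1) % len(adj[v])
    v_last = adj[root][-1]                          # arc closing the tour
    tail = first[v_last] + pos[(v_last, root)]
    succ[tail] = tail                               # break the cycle
    rank = list_rank(succ)                          # from the sketch above
    parent = [-1] * n
    for u in range(n):
        for k, v in enumerate(adj[u]):
            if rank[first[u] + k] > rank[first[v] + pos[(v, u)]]:
                parent[v] = u                       # u->v traversed first
    return parent

# Path 0 - 1 - 2 rooted at 0: parents are [-1, 0, 1].
print(root_tree([[1], [0, 2], [1]], root=0))
```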

 

Year of completion:  June 2010
 Advisor : P. J. Narayanan, Kishore Kothapalli

Related Publications

  • M. Suhail Rehman, Kishore Kothapalli and P. J. Narayanan - Fast and Scalable List Ranking on the GPU, Proceedings of the 23rd International Conference on Supercomputing (ICS 2009), 8-12 June 2009, New York, USA. [PDF]

  • Kishore Kothapalli, Rishabh Mukherjee, M. Suhail Rehman, P. J. Narayanan, Kannan Srinathan and Suryakant Patidar - A Performance Prediction Model for the CUDA GPGPU Platform, 16th Annual International Conference on High Performance Computing (HiPC 2009), Kochi, India, December 2009. [PDF]


Learning Non-Linear Kernel Combinations Subject to General Regularization: Theory and Applications


B. Rakesh Babu

Kernel methods are among the important recent developments in machine learning, with applications in computer vision, speech recognition, bio-informatics, etc. This class of algorithms combines the stability and efficiency of linear algorithms with the descriptive power of nonlinear features. A kernel function maps the data (implicitly) to a different space, often very high-dimensional compared to the input space, so that complex patterns in the data become simpler to detect and learn. Support Vector Machines (SVMs) are a kernel method that has been widely successful for classification tasks. The performance of such an algorithm depends on the choice of kernel, and finding the right kernel can be a complicated task. To overcome this, learning the kernel itself has emerged as a new paradigm in recent years: the kernel is parameterized as a weighted linear combination of base kernels, and the weights are optimized jointly with the objective of the task.

Learning both the SVM parameters and the kernel parameters is known as Multiple Kernel Learning (MKL). Many MKL formulations have been presented in the literature; however, all of them restrict themselves to linear combinations of base kernels. In this thesis, we show how the existing optimization techniques for MKL can be extended to learn non-linear kernel combinations subject to general regularization on the kernel parameters. Although this leads to a non-convex problem, the proposed method retains the efficiency of existing large-scale optimization algorithms. We name the new formulation Generalized Multiple Kernel Learning (GMKL). We highlight the advantages of GMKL by tackling problems such as feature selection and learning discriminative parts for object categorization. We show how the proposed formulation can lead to better results not only compared to traditional MKL, but also compared to state-of-the-art wrapper and filter methods for feature selection. In the problem of learning discriminative parts for object categorization, our objective is to determine minimal sets of pixels and image regions required for the task. We use multiple kernel learning to select the most relevant pixels and regions for classification, and then show how the framework can be used to enhance our understanding of the categorization problem at hand, determine the importance of context, and highlight artifacts in the training data. We also tackle the problem of recognizing characters in images of natural scenes, which is not handled well by traditional OCR techniques, in the MKL framework. We assess the performance of various features (using a bag-of-visual-words representation) with nearest-neighbor and SVM classification, and investigate appropriate representation schemes for recognition using MKL.
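One concrete non-linear combination is a product of per-feature RBF kernels, whose learned weights double as feature selectors. The sketch below alternates SVM training with a projected gradient step on the kernel parameters; the l1 regularizer, learning rate, and finite-difference gradient are illustrative simplifications of GMKL-style optimization, not the thesis's exact procedure.

```python
import numpy as np
from sklearn.svm import SVC

def product_rbf(X, Y, d):
    """Non-linear kernel combination: k(x, y) = prod_k exp(-d_k (x_k - y_k)^2).
    A feature whose weight d_k shrinks to 0 drops out of the kernel,
    which is what makes this form useful for feature selection."""
    sq = (X[:, None, :] - Y[None, :, :]) ** 2     # pairwise per-feature
    return np.exp(-(sq * d).sum(axis=-1))

def gmkl_step(X, y, d, lr=0.1, lam=1.0, eps=1e-4):
    """One outer iteration: train the SVM with the current kernel, then
    take a projected gradient step on d against the SVM dual objective
    plus an l1 regularizer lam * sum(d)."""
    svm = SVC(C=10.0, kernel=lambda A, B: product_rbf(A, B, d)).fit(X, y)
    sv, coef = svm.support_, svm.dual_coef_[0]    # coef_i = y_i * alpha_i
    K0 = product_rbf(X[sv], X[sv], d)
    grad = np.empty_like(d)
    for k in range(len(d)):                       # finite-difference dW/dd_k
        dp = d.copy(); dp[k] += eps
        Kp = product_rbf(X[sv], X[sv], dp)
        grad[k] = -0.5 * coef @ ((Kp - K0) / eps) @ coef
    return np.maximum(d - lr * (grad + lam), 0.0) # keep weights >= 0
```

Iterating `gmkl_step` drives uninformative feature weights toward zero while the SVM is retrained on the shrinking kernel, which is the wrapper-style feature selection the text compares against.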

In short, the contributions of this thesis are: (i) a new MKL formulation that can learn non-linear kernel combinations subject to general regularization on the kernel parameters; (ii) an exploration of the utility of multiple kernel learning for feature selection and for learning informative parts for object category recognition; and (iii) recognition of character images taken in natural scenes using state-of-the-art object recognition schemes, along with an exploration of appropriate representation schemes for recognition using MKL.

 

Year of completion:  2010
 Advisor : C. V. Jawahar, Dr. Manik Varma

Related Publications

  • Manik Varma and Bodla Rakesh Babu - More Generality in Efficient Multiple Kernel Learning, Proceedings of the International Conference on Machine Learning (ICML 2009), 14-18 June 2009, Montreal, Quebec. [PDF]

  • T. E. de Campos, B. R. Babu and M. Varma - Character Recognition in Natural Images, Proceedings of the International Conference on Computer Vision Theory and Applications, Lisbon, Portugal, February 2009.


De-identification for Privacy Protection in Surveillance Videos.


Prachi Agrawal (homepage)


Advances in cameras and web technology have made it easy to capture large amounts of video and share it with many people. An increasing number of video cameras also observe public spaces such as airports, train stations, shopping malls, and streets. Video surveillance is an important tool to prevent, detect, and reduce crime and disorder. It is also widely used to monitor people's conduct, detect their presence in an area, and possibly study their actions. However, many civil rights and privacy groups have expressed concern that by allowing continual increases in the surveillance of citizens, we will end up in a mass surveillance society with extremely limited or non-existent political and personal freedoms. With the widespread use of surveillance today, we are becoming used to ''being watched'', and the more accustomed we become, the less likely we are to fight to maintain our right to privacy. Critics believe that in addition to its obvious function of identifying and capturing individuals committing undesirable acts, surveillance also creates in everyone a feeling of always being watched, so that they become self-policing. At the same time, hardware prices have fallen and the capabilities of systems have grown dramatically; the day is not far off when even low-cost video surveillance units can be converted into ''intelligent'' human recognition systems. This raises concerns about the unintentional and unwarranted invasion of the privacy of individuals caught in the videos.


The data collected during video surveillance consists mainly of images and sounds that allow the people captured to be identified, directly or indirectly, in addition to monitoring their conduct. While there may be a security need to identify the individuals in these videos, identifying the action suffices in most cases; the actor needs to be identified only rarely, and only to authorized personnel. The privacy concerns related to the processing and distribution of collected videos are genuine and will grow with the wider adoption of video technology. To address these concerns, automated methods to de-identify individuals in these videos are necessary. De-identification does not aim at destroying all information involving the individuals: its ideal goal is to obscure the identity of the actor without obscuring the action. There is a natural trade-off between protecting privacy and providing sufficient detail. The goal is to protect the privacy of individuals while providing a sufficient feel for the human activities in the space being imaged.

Policies on privacy issues are still unclear and fast evolving. The need of the hour is to be cautious, as digital data can be replicated with little control, especially when placed on the cloud. This thesis outlines the scenarios in which de-identification is required and the issues they bring out. We also present an approach to de-identify individuals in videos. Our approach involves tracking and segmenting individuals in a conservative voxel space involving x, y, and time. A de-identification transformation is applied per frame using these voxels to obscure identity. A robust de-identification scheme with a randomization module was designed to make reconstruction and comparison-based identification attacks impossible. Ideally, face, silhouette, gait, and other characteristics all need to be obscured. We show results of our scheme on a number of videos and for several variations of the transformations. We present the results of applying algorithmic identification to the transformed videos, as well as a user study evaluating how well humans can identify individuals from them. The results showed that as the parameter controlling the amount of de-identification increased, the actors became less identifiable (more privacy), while the video started losing context and detail. The studies also suggest that an action can be recognized with more accuracy than the actor in a de-identified video, which is the guiding principle of de-identification. We also created a suitable dataset for the purpose of user studies, as the standard datasets available online are not challenging enough for such an evaluation.
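The per-frame transformation can be pictured with the following sketch, which pixelates the segmented person region and injects per-block random noise so the result cannot easily be inverted or matched against a re-blurred gallery. The block size and noise strength are illustrative stand-ins for the thesis's privacy parameter, and the exact transform here is hypothetical.

```python
import numpy as np

def deidentify_frame(frame, mask, block=12, strength=0.5, rng=None):
    """Obscure the masked region of one video frame.
    `frame` is HxWx3 uint8, `mask` is the HxW boolean segmentation of the
    person for this frame (one slice of the x-y-time voxel volume).
    Larger `block`/`strength` means more privacy and less visible detail."""
    rng = rng or np.random.default_rng()
    out = frame.astype(float).copy()
    h, w = mask.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            m = mask[y:y + block, x:x + block]
            if m.any():
                patch = out[y:y + block, x:x + block]
                mean = patch[m].mean(axis=0)          # flatten the block
                noise = strength * rng.normal(0.0, 16.0, size=mean.shape)
                patch[m] = mean + noise               # randomize it
    return np.clip(out, 0, 255).astype(np.uint8)
```

Randomizing each block independently per frame is what blocks comparison attacks: an adversary cannot regenerate the same transformed video from a gallery image because the noise is not reproducible.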

 

Year of completion:  2010
 Advisor : P. J. Narayanan

Related Publications

  • Prachi Agrawal and P. J. Narayanan - Person De-identification in Videos, Proceedings of the Ninth Asian Conference on Computer Vision (ACCV 2009), 23-27 September 2009, Xi'an, China. [PDF]

  • Prachi Agrawal and P. J. Narayanan - Person De-identification in Videos, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2010.

Demonstrations

  • The entire system was presented for the first time, and a user study was conducted with the people who attended ExOR in November 2009.
  • The system was made more user-friendly; it was showcased for demonstration, and another detailed user study was conducted with people who attended the R&D Showcase in January 2010.

