High Quality Rendering of Large Point-based Surfaces


Naveen Kumar Reddy Bolla (homepage)

Point-based models have gained popularity in recent years. The most attractive feature of point-based primitives is their simplicity: no connectivity or topology information about the surface is stored explicitly. Adaptive modification and dynamic sampling of the surface therefore avoid the complexity and robustness problems of triangle meshes, and level-of-detail control is simple, fast and almost continuous. However, point-based methods suffer from surface-normal discontinuities in the representation, and shading is not independent of the sampling rate. Point-based models can be visualized by rendering the points directly. Each sample in the representation carries attributes (normal, color, material properties) used to render and shade the set as a continuous surface. Typically, a point sample also has a radius that defines an area around it. Such samples are called surfels and approximate the shape of the surface linearly in their neighborhood. Since linear splats can only be flat shaded, such representations require a large number of samples for good shading quality. This slows rendering due to the increased computation and the related memory-bus activity.

Point-based rendering also suffers from the limited resolution of the fixed number of samples representing the model. At some viewing distances, the screen-space resolution is high relative to the point samples, which causes under-sampling. A better way to render a model is to re-sample the surface during rendering at the desired resolution in object space, guaranteeing a sampling density sufficient for the image resolution. Output-sensitive sampling samples objects at a resolution that matches the expected resolution of the output image; this is crucial for hole-free point-based rendering. Many technical issues in point-based graphics boil down to reconstruction and re-sampling. A point-based representation should be as small as possible while still conveying the shape well.

In the first part of this thesis, we present a compact representation for point-based models using non-linear surface elements. In the second part, we present a method for fast ray casting of a dynamic surface defined by a large number of attributed points, called metaballs.

Algebraic Splats: Higher-order approximations of the local neighborhood have the potential to represent the shape with fewer primitives, achieving higher rendering speed and quality simultaneously. In this thesis, we present algebraic splats: low-order polynomials that approximate the local neighborhood of a point set. We adapt the Moving Least Squares (MLS) approximation to generate algebraic splat representations for a given point set. With quadratic and cubic splats as the basic rendering primitive, point-based models can be approximated using 10% or fewer points than with linear splats, for equivalent visual quality. Though rendering a non-linear patch is slower than rendering a linear splat, the overall rendering of an object can be faster, and the approximation error can be lower with a small number of higher-order primitives. We represent a given point set with a user-specified, small number of algebraic splats with optimal rendering quality. This is done by decimating the point set and approximating each retained sample with a local algebraic surface based on the MLS approximation.
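To make the fitting step concrete, the following minimal sketch (our illustration, not the thesis's implementation; the Gaussian weighting, neighbourhood and tangent-frame construction are assumptions) fits a quadratic splat w = a u^2 + b u v + c v^2 + d u + e v + f over the local frame of a point neighbourhood by weighted least squares, in the spirit of an MLS fit:

    import numpy as np

    def fit_quadratic_splat(points, center, normal, h):
        """Fit w = a*u^2 + b*u*v + c*v^2 + d*u + e*v + f to the neighbourhood
        of `center`, in a local frame whose w-axis is `normal`; `h` is the
        Gaussian support radius used for the MLS-style weighting."""
        normal = normal / np.linalg.norm(normal)
        # Build an orthonormal tangent frame (u_axis, v_axis, normal).
        t = np.array([1.0, 0.0, 0.0]) if abs(normal[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        u_axis = np.cross(normal, t); u_axis /= np.linalg.norm(u_axis)
        v_axis = np.cross(normal, u_axis)
        d = points - center
        u, v, w = d @ u_axis, d @ v_axis, d @ normal
        weights = np.exp(-(u**2 + v**2 + w**2) / h**2)       # MLS-style Gaussian weights
        # Weighted least squares for the six polynomial coefficients.
        A = np.stack([u**2, u*v, v**2, u, v, np.ones_like(u)], axis=1)
        sw = np.sqrt(weights)
        coeffs, *_ = np.linalg.lstsq(A * sw[:, None], w * sw, rcond=None)
        return coeffs, (u_axis, v_axis, normal)

    # Example: points sampled from a spherical cap around (0, 0, 1); the fitted
    # quadratic coefficients a and c should come out close to -0.5, the curvature
    # of the unit sphere seen as a height field over its tangent plane.
    rng = np.random.default_rng(0)
    uv = rng.uniform(-0.3, 0.3, size=(200, 2))
    pts = np.stack([uv[:, 0], uv[:, 1], np.sqrt(1 - uv[:, 0]**2 - uv[:, 1]**2)], axis=1)
    coeffs, frame = fit_quadratic_splat(pts, np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0]), 0.3)
    print(coeffs)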

[Figure: the David model rendered with points and with algebraic splats]

Our rendering provides smooth surfaces with normals everywhere. Because the splats are semi-implicit in the local reference domain, the polynomials can be ray traced directly on today's GPUs. The major contributions of this work are: (i) a compact and lightweight representation for point geometry using non-linear surface elements, called algebraic splats; (ii) a multi-resolution representation which provides continuous level of detail; and (iii) a ray-tracing approach to render the algebraic splats on the GPU. The method includes an adaptive anti-aliasing step to obtain smooth silhouettes. The David 2mm model can be represented using about 30K algebraic splats (0.8% of the original points) with little or no reduction in visual quality. We can ray cast the David model with adaptive anti-aliasing (3x grid) at 130-150 fps, and at 80-90 fps with shadowing. Our method is faster by a factor of about 10 on most models and requires about 90% less disk space in comparison to linear splats.
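For intuition about the ray tracing of splats, note that each splat is a polynomial height field w = h(u, v) over its local frame, so intersecting a ray with it reduces to a low-degree polynomial in the ray parameter t. A minimal CPU-side sketch (illustrative only; the GPU version does this per pixel, and all names here are ours), using the quadratic coefficients and frame returned by the fit above:

    import numpy as np

    def intersect_quadratic_splat(o, d, coeffs, center, frame, radius):
        """Intersect the ray x(t) = o + t*d with the splat w = a*u^2 + b*u*v +
        c*v^2 + d*u + e*v + f expressed in the local frame (u_axis, v_axis,
        normal) at `center`.  Returns the nearest positive hit inside the
        splat's support radius, or None."""
        a, b, c, dl, e, f = coeffs
        u_axis, v_axis, n_axis = frame
        po = o - center
        u0, v0, w0 = po @ u_axis, po @ v_axis, po @ n_axis   # ray origin in local frame
        du, dv, dw = d @ u_axis, d @ v_axis, d @ n_axis      # ray direction in local frame
        # h(u(t), v(t)) - w(t) = 0 is a quadratic A*t^2 + B*t + C = 0 in t.
        A = a*du*du + b*du*dv + c*dv*dv
        B = 2*a*u0*du + b*(u0*dv + v0*du) + 2*c*v0*dv + dl*du + e*dv - dw
        C = a*u0*u0 + b*u0*v0 + c*v0*v0 + dl*u0 + e*v0 + f - w0
        if abs(A) > 1e-12:
            roots = np.roots([A, B, C])
        elif abs(B) > 1e-12:
            roots = np.array([-C / B])
        else:
            return None
        for t in sorted(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 1e-6):
            u, v = u0 + t*du, v0 + t*dv
            if u*u + v*v <= radius*radius:       # stay inside the splat's support
                return t
        return None

Applied to the spherical-cap splat fitted above, a ray from (0, 0, 3) pointing down the w-axis returns t close to 2, i.e. the apex of the cap; on the GPU, many such tests run in parallel, one thread per pixel.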

Metaballs: As the available computational power grows steadily, more and more areas of science rely on simulations of ever-growing problem sizes, producing huge amounts of data to be analyzed. Simulation and experimental measurement in the life sciences, physics, chemistry, materials science and thermodynamics yield large and often time-dependent datasets. Interactive visualization is the key service that facilitates the analysis of such datasets; it enables researchers in these fields to quickly assess and compare the results of a simulation or measurement, verify and improve their models, and thereby come ever closer to understanding how dynamic processes work.

Metaballs (also known as blobs) are popular particle-based implicit surface representations. A set of metaballs can represent smooth, deformable, free-form shapes as the iso-surface of a scalar field. Current methods can handle only a moderate number of metaballs at interactive rates on commodity GPUs. In this thesis, we present a method to handle a million or more dynamic metaballs at interactive rates. We use a perspective grid as the acceleration structure, built in each frame on the GPU using fast primitives. Perspective grids provide perfect coherence to primary rays. We ray trace one grid cell at a time using an adaptation of the marching points method to metaballs. This approach extracts high performance from today's SIMD GPUs. We show interactive handling of up to a million balls. Our method also handles secondary rays, though at much reduced speed due to the lack of coherence, and is suitable for live visualization of particle-based simulations. The average build time for 1 million primitives is about 38 milliseconds, and we can render a million metaballs in about 94 milliseconds. Secondary rays are slower because of the lack of coherence, with a rendering time of 150 ms for 1K primitives.
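The field-and-iso-surface idea can be sketched as follows (our illustration under assumed choices: the compactly supported kernel, iso value and step size are ours, and plain bisection along the ray stands in for both the perspective-grid traversal and the marching points refinement):

    import numpy as np

    def metaball_field(x, centers, radii):
        """Sum of compactly supported kernels; the surface is {x : field(x) = iso}."""
        d2 = np.sum((centers - x) ** 2, axis=1) / radii ** 2
        return np.where(d2 < 1.0, (1.0 - d2) ** 3, 0.0).sum()

    def ray_hit_metaballs(o, d, centers, radii, iso=0.25, t_max=10.0, step=0.01):
        """Walk sample points along o + t*d; once the iso-surface is bracketed
        by a sign change, refine the hit with bisection."""
        d = d / np.linalg.norm(d)
        t, f_prev = 0.0, metaball_field(o, centers, radii) - iso
        while t < t_max:
            t_next = t + step
            f_next = metaball_field(o + t_next * d, centers, radii) - iso
            if f_prev * f_next < 0.0:              # sign change: surface crossed
                lo, hi = t, t_next
                for _ in range(30):                # bisection refinement
                    mid = 0.5 * (lo + hi)
                    f_mid = metaball_field(o + mid * d, centers, radii) - iso
                    if f_mid * f_prev < 0.0:
                        hi = mid
                    else:
                        lo = mid
                return 0.5 * (lo + hi)
            t, f_prev = t_next, f_next
        return None

    centers = np.array([[0.0, 0.0, 3.0], [0.4, 0.0, 3.2]])
    radii = np.array([1.0, 0.8])
    print(ray_hit_metaballs(np.zeros(3), np.array([0.0, 0.0, 1.0]), centers, radii))  # ~2.39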

 

Year of completion:  July 2010
 Advisor : P. J. Narayanan

Related Publications

  • Naveen Kumar Bolla and P. J. Narayanan - Algebraic Splats Representation for Point Based Models, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 71-78, 16-19 Dec 2008, Bhubaneswar, India. [PDF]

  • Naveen Kumar Bolla and P. J. Narayanan - Algebraic Splats for High-Quality Rendering of Points (Submitted to Graphical Models, under review).
  • Naveen Kumar Bolla and P. J. Narayanan - Real-time Ray-tracing of Million Metaballs (Submitted to EGSR).

Downloads

thesis

ppt

Efficient Privacy Preserving Protocols for Visual Computation


Maneesh Upmanyu (homepage)

The rapid expansion of the Internet is receiving a great deal of attention worldwide. Technological developments and the increase in online communication have played a vital role in revolutionizing the Information Age. The tremendous growth of online applications that manipulate users' personal data has resulted in the widespread availability of that data in digital form. This raises the issue of privacy protection against potential misuse of the data by legitimate service providers or by intruders. Without proper countermeasures to thwart attacks, security problems become a major threat and a serious impediment to the further development of business applications on communication systems.

Many current solutions provide information security by assuming a level of trust among the parties. Leakage of critical data to third parties is prevented by applying cryptographic primitives as a secure layer over the standard algorithm. Privacy-preserving computation, on the other hand, is more closely related to Secure Multiparty Computation (SMC). SMC enables two parties, one holding a function f() and the other an input x, to compute f(x) without revealing either to the other. However, solutions based on the general SMC protocol require enormous computational and communication overhead, which limits the practical deployment of the secure algorithms.

In this dissertation, we focus on the development of highly secure, communication- and computation-efficient algorithms for problems with immediate impact in computer vision and related areas. Security issues in computer vision primarily originate from the storage, distribution and processing of personal data, whereas privacy concerns arise from the tracking of a user's activity. The primary challenge is to provide the ability to perform generic computations on visual data while ensuring provable security. In this thesis, we propose lightweight encryption schemes for visual data such that a server can carry out computations on the encrypted data, and store the stream if required, without being able to decipher the actual contents of the images. Moreover, the protocols are designed so that interaction and data communication among the servers are kept to a minimum.

It has been shown before that the best way to achieve secure computation on a remote server is through the cryptographic protocol of SMC. Thus, a method that provides provable security while allowing efficient computations, without incurring significant computation or communication overhead, has remained elusive until now. We show that, for designing secure visual algorithms, one can exploit certain properties inherent to visual data, such as scalability and limited range, to break this barrier. We study and propose secure solutions for applications such as Blind Authentication, i.e., blindly authenticating a remote user using his or her biometric. Subsequently, we present a highly secure framework for carrying out visual surveillance on random-looking video streams at remote servers. We then propose a simple and efficient cloud-computing based solution using the paradigm of secret sharing to privately cluster arbitrarily partitioned data among N users. The solutions we propose are accurate, efficient and scalable, and have the potential to extend to even more diverse applications.

In our first work, blind authentication, we propose a private biometric authentication protocol that is extremely secure under a variety of attacks and can be used with a wide variety of biometric traits. The protocol is blind in the sense that it reveals only the identity, and no additional information about the user or the biometric, to the authenticating server, and vice versa. The primary advantage of the proposed approach is the ability to classify a strongly encrypted feature vector using generic classifiers such as neural networks and SVMs. Our solution addresses the concerns of user privacy, template protection and trust, and combines the advantages of biometric authentication with the security of public-key cryptography.
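As a toy illustration of the flavour of such computations (this is not the thesis's blind authentication protocol, which additionally uses randomization and interaction to support kernel classifiers; here a tiny Paillier instance with deliberately small, insecure parameters lets a server accumulate a linear classifier score over homomorphically encrypted features without seeing them):

    import random
    from math import gcd

    def paillier_keygen(p=293, q=433):                 # tiny primes, illustration only
        n = p * q
        lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
        mu = pow(lam, -1, n)                           # valid because g = n + 1 below
        return (n, n + 1), (lam, mu)

    def enc(pub, m):
        n, g = pub
        r = random.randrange(2, n)
        while gcd(r, n) != 1:
            r = random.randrange(2, n)
        return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

    def dec(pub, priv, c):
        n, _ = pub
        lam, mu = priv
        return ((pow(c, lam, n * n) - 1) // n) * mu % n

    pub, priv = paillier_keygen()
    x = [3, 1, 4, 1, 5]                    # client's integer-quantised feature vector
    w = [2, 0, 1, 3, 1]                    # server's (linear) classifier weights
    cx = [enc(pub, xi) for xi in x]        # the client sends only ciphertexts
    n2 = pub[0] * pub[0]
    score_ct = 1
    for ci, wi in zip(cx, w):
        score_ct = (score_ct * pow(ci, wi, n2)) % n2   # E(x)^w = E(w*x); products add up
    print(dec(pub, priv, score_ct), sum(wi * xi for wi, xi in zip(w, x)))   # both print 18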

We then present an efficient, practical and highly secure framework for implementing visual surveillance on untrusted remote computers. To achieve this, we demonstrate that the properties of visual data can be exploited to break the bottleneck of computational and communication overheads. The issues in the practical implementation of specific algorithms, including change detection, optical flow and face detection, are addressed. Our method enables distributed secure processing and storage, while retaining the ability to reconstruct the original data in case of a legal requirement. Such an architecture provides both security and computation and communication efficiency.

We next extend the proposed paradigm to unsupervised learning, performing K-means clustering in the encrypted domain. Traditional approaches use primitives such as SMC or PKC, compromising the efficiency of the solutions in return for a level of privacy that is usually overkill in practice. We instead use the paradigm of secret sharing, which allows the data to be divided into multiple shares and processed separately at different servers. Our method shows that privacy need not always come at the cost of efficiency. The proposed solution is not only computationally efficient but also secure independent of whether or not P ≠ NP.
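A minimal sketch of the additive secret-sharing primitive (the modulus, share count and helper names are illustrative; the thesis's clustering protocol builds considerably more on top of this):

    import random

    MOD = 2 ** 31 - 1        # illustrative modulus

    def share(value, n_servers):
        """Split an integer into n additive shares that sum to `value` mod MOD;
        any n-1 of the shares reveal nothing about the value."""
        parts = [random.randrange(MOD) for _ in range(n_servers - 1)]
        parts.append((value - sum(parts)) % MOD)
        return parts

    def reconstruct(parts):
        return sum(parts) % MOD

    # Sums (and hence the centroid updates needed by K-means) can be computed
    # share-wise by the servers, which never see the underlying values.
    a, b = share(17, 3), share(25, 3)
    summed = [(x + y) % MOD for x, y in zip(a, b)]
    print(reconstruct(summed))    # 42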

 

Year of completion:  June 2010
 Advisors : C. V. Jawahar, Anoop M. Namboodiri, Kannan Srinathan

Related Publications

  • Maneesh Upmanyu, Anoop M. Namboodiri, Kannan Srinathan and C. V. Jawahar - Blind Authentication: A Secure Crypto-Biometric Verification Protocol, IEEE Transactions on Information Forensics and Security, Vol. 5(2), pp. 255-268 (2010). [PDF]

  • Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C. V. Jawahar - Efficient Biometric Verification in Encrypted Domain, Proceedings of the 3rd International Conference on Biometrics (ICB 2009), pp. 899-908, June 2-5, 2009, Alghero, Italy. [PDF]

  • Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C. V. Jawahar - Efficient Privacy Preserving Video Surveillance, Proceedings of the 12th International Conference on Computer Vision (ICCV), 2009, Kyoto, Japan. [PDF]

  • Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C. V. Jawahar - Efficient Privacy Preserving K-Means Clustering, Pacific Asia Workshop on Intelligence and Security Informatics (PAISI) 2010 (Best Paper Award). (Download)

Downloads

thesis

ppt

Speeding Up MRF Optimization Using Graph Cuts For Computer Vision


Vibhav Vineet (homepage)

Many problems in computer vision are mapped to the minimization of an energy or cost function defined over a discrete Markov Random Field (MRF) or Conditional Random Field (CRF). The primary reason for this popularity has been the success of graph-cut based methods in solving various labeling problems, especially image segmentation, stereo correspondence and image denoising. Subsequently, computer vision researchers focused on improving the computational efficiency of graph-cut based methods. In this dissertation, we present two methods to speed up graph cuts for solving different MRF optimization problems optimally.
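For reference, the energy minimized over such a random field has the standard form (notation ours, not quoted from the dissertation):

    E(x) = \sum_{p \in P} \theta_p(x_p) + \lambda \sum_{(p,q) \in N} \theta_{pq}(x_p, x_q)

where \theta_p are the data (unary) terms, \theta_{pq} the smoothness (pairwise) terms over the neighbourhood system N, and the labels x_p are binary for segmentation or drawn from a larger label set for stereo correspondence and denoising; graph cuts minimize such energies exactly in the bilabel submodular case and approximately, via moves such as α-expansion, in the multilabel case.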

In the first half of the dissertation, we present efficient methods to speed up MRF optimization using graph cuts on the GPU. In particular, we present an optimized implementation of the push-relabel method for various labeling problems and show how we efficiently use the different facilities available on the GPU. We also present the incremental α-expansion algorithm to solve multilabel MRFs on the GPU and show how shifting the energy-function computation and graph construction to the GPU speeds up the overall computation. We compare the efficiency of our approaches on image segmentation, image restoration, photomontage and disparity computation on the GPU.
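For readers unfamiliar with the underlying min-cut computation, here is a minimal sequential sketch of the generic push-relabel method on a toy dense graph (our illustration; the GPU implementation described above applies pushes and relabels to many grid nodes in parallel):

    def push_relabel_max_flow(cap, s, t):
        """Generic push-relabel max-flow on a dense capacity matrix `cap`.
        The min-cut it induces is what graph-cut MRF solvers use for labelling."""
        n = len(cap)
        flow = [[0] * n for _ in range(n)]
        height, excess = [0] * n, [0] * n
        height[s] = n
        for v in range(n):                            # saturate all source edges
            flow[s][v], flow[v][s], excess[v] = cap[s][v], -cap[s][v], cap[s][v]
        excess[s] = -sum(cap[s])
        active = [v for v in range(n) if v not in (s, t) and excess[v] > 0]
        while active:
            u, pushed = active[0], False
            for v in range(n):
                residual = cap[u][v] - flow[u][v]
                if residual > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual)  # push along an admissible edge
                    flow[u][v] += delta
                    flow[v][u] -= delta
                    excess[u] -= delta
                    excess[v] += delta
                    if v not in (s, t) and v not in active:
                        active.append(v)
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:                            # no admissible edge: relabel
                height[u] = 1 + min(height[v] for v in range(n)
                                    if cap[u][v] - flow[u][v] > 0)
            if excess[u] == 0:
                active.pop(0)
        return sum(flow[s])                           # value of the max flow / min cut

    # Tiny example: source 0, sink 3.
    cap = [[0, 3, 2, 0],
           [0, 0, 1, 2],
           [0, 0, 0, 3],
           [0, 0, 0, 0]]
    print(push_relabel_max_flow(cap, 0, 3))           # 5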

The second half of the dissertation deals with the minimization of energy functions defined over large images involving millions of pixels and a large number of labels. The computation and memory costs of current vision algorithms grow rapidly with the number of variables in the MRF, which makes them impractical for problems involving large images. We present a Pyramid Reparameterization scheme that improves computational efficiency without sacrificing global optimality for bilabel segmentation. We also formulate multilabel problems, especially stereo correspondence and image denoising, in a multiresolution framework using Pyramid Reparameterization. This formulation reduces both the computation and memory costs. We model the optimization problem on the pyramidal framework to dynamically update and initialize the solution at the current level with the solution from the previous level. The results of this method are compared with those obtained using conventional methods that run graph cuts directly on the full-resolution image.

 

Year of completion:  June 2010
 Advisor : P. J. Narayanan

Related Publications

  • Vibhav Vineet and P. J. Narayanan - Solving Multilabel MRFs using Incremental Alpha-Expansion on the GPUs, Proceedings of the Ninth Asian Conference on Computer Vision (ACCV 09), 23-27 September 2009, Xi'an, China. [PDF]

  • Vibhav Vineet, Pawan Harish, Suryakanth Patidar and P. J. Narayanan - Fast Minimum Spanning Tree for Large Graphs on the GPU Proceedings of High-Performance Graphics (HPG 09), 1-3 August, 2009, Louisiana, USA. [PDF]

  • Vibhav Vineet and P. J. Narayanan - CUDA Cuts: Fast Graph Cuts on the GPU, Proceedings of the CVPR Workshop on Visual Computer Vision on GPUs, 23-28 June 2008, Anchorage, Alaska, USA. IEEE Computer Society. [PDF]

  • P. J. Narayanan, Vibhav Vineet and Timo Stich - Fast Graph Cuts on the GPU, GPU Computing Gems (GCG), Volume 1, Dec. 2010 (Book Chapter).
  • Pawan Harish, Vibhav Vineet and P. J. Narayanan - Large Graph Algorithms for Massively Multi-threaded Architectures, IIIT Tech Report, IIIT/TR/2009/74.
  • Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan - Fast Minimum Spanning Tree for Large Graphs on the GPU, GPU Computing Gems (GCG). (Under Submission)

Downloads

thesis

ppt

Towards Efficient and Scalable Visual Processing in Images and Videos


Mihir Jain

The amount of multimedia content produced and made available on the Internet and in professional and personal collections is constantly growing, and so is the need for efficient and effective ways to manage it. This has led to a great amount of research into content-based retrieval and visual recognition. In this thesis, we focus on efficient visual content analysis in images and videos; efficiency has emerged as one of the key issues as the quantity of data increases. Understanding visual content has several aspects. One can concentrate on recognizing the inherent characteristics of an image (standalone or taken from a video), such as objects, scene and context. Searching for a sequence of images based on similarity, or characterizing a video based on its visual content, are other aspects.

We investigate three different approaches to visual content analysis in this thesis. In the first, we target the detection and classification of different object and scene classes in images and videos. The task of classification is to predict the presence of an object or a specific scene of interest in a test image; object detection further involves localizing each instance of the object present. We experiment extensively on very large and challenging datasets with a large number of object and scene categories. Both detection and classification are based on Random Forests combined with different visual features describing shape, appearance and color. We exploit the computational efficiency of Random Forests in both training and testing, along with their other properties, for detection and classification. We also propose enhancements over our baseline object detector. Our main contribution here is fast object detection with accuracy comparable to the state of the art.
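A minimal sketch of the classification side (illustrative only: the synthetic bag-of-visual-words histograms, the scikit-learn forest and its parameters stand in for the thesis's combined shape, appearance and colour features and tuned forests):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Placeholder data: each image is represented by a histogram over a visual
    # vocabulary (e.g. quantised local descriptors).  A real pipeline would
    # compute and concatenate such histograms from the images themselves.
    rng = np.random.default_rng(0)
    n_images, vocab_size, n_classes = 600, 256, 5
    labels = rng.integers(0, n_classes, size=n_images)
    class_profiles = rng.dirichlet(np.ones(vocab_size), size=n_classes)
    X = np.stack([rng.multinomial(200, class_profiles[y]) for y in labels]).astype(float)
    X /= X.sum(axis=1, keepdims=True)           # L1-normalised bag-of-words histograms

    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=0)
    clf.fit(X_tr, y_tr)                         # training is fast and parallel
    print("held-out accuracy:", clf.score(X_te, y_te))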

The second approach processes a continuous video stream to detect video segments of interest. Our method is example-based: the visual content to be detected or filtered is characterized by a set of examples available a priori. We approach the problem in a manner complementary to video retrieval. We begin with a set of examples (which would be used as queries in retrieval) and index them in the database; the larger video collection that needs to be processed is unseen during the offline indexing phase. We propose an architecture based on a trie data structure and the bag-of-words model to simultaneously match multiple example videos in the database against a large input video stream. We demonstrate the application of our architecture to the task of content-based copy detection (CBCD).
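A minimal sketch of the trie-plus-bag-of-words idea (illustrative: a real system quantizes frame descriptors into visual words and tolerates noise, gaps and near-duplicates, which this exact-matching toy does not):

    class TrieNode:
        def __init__(self):
            self.children = {}
            self.video_id = None              # set at the node where an example ends

    def build_trie(example_videos):
        """example_videos: {video_id: [word, ...]} where each word is the
        quantised descriptor of one frame (or keyframe)."""
        root = TrieNode()
        for vid, words in example_videos.items():
            node = root
            for w in words:
                node = node.children.setdefault(w, TrieNode())
            node.video_id = vid
        return root

    def scan_stream(root, stream):
        """Report (stream_position, matched_video_id) for every example whose
        word sequence occurs in the stream.  One active trie pointer is kept
        per candidate match, so all examples are matched in a single pass."""
        matches, active = [], []
        for i, w in enumerate(stream):
            active.append(root)               # a new match may start at this frame
            next_active = []
            for node in active:
                child = node.children.get(w)
                if child is not None:
                    if child.video_id is not None:
                        matches.append((i, child.video_id))
                    next_active.append(child)
            active = next_active
        return matches

    examples = {"ad_clip": [5, 5, 9, 2], "news_jingle": [7, 1, 1]}
    stream = [3, 5, 5, 9, 2, 7, 1, 1, 4]
    print(scan_stream(build_trie(examples), stream))
    # [(4, 'ad_clip'), (7, 'news_jingle')]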

In our third and final approach, we apply pattern-mining algorithms to videos to characterize their visual content. These are derived from data-mining schemes for efficient analysis of the content in video databases. Two different video mining schemes are employed, both aimed at detecting frequent and representative patterns. The first uses an efficient frequent-pattern mining algorithm over a quantized feature space. The second uses Random Forests to represent video data as sequences and mines the frequent sequences. We experiment on broadcast news videos to detect what we define as video stop-words and to extract the more important content, such as breaking news. We are also able to characterize movies by automatically identifying their characteristic scenes and main actors.
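A toy sketch of the frequent-sequence idea (illustrative: the integer words stand for quantized frame descriptors, and support is counted once per video):

    from collections import Counter

    def frequent_sequences(videos, length=3, min_support=2):
        """Count contiguous word n-grams across a collection of quantised videos
        and keep those appearing in at least `min_support` videos; patterns that
        recur across many broadcasts are candidates for video stop-words."""
        support = Counter()
        for words in videos:
            grams = {tuple(words[i:i + length]) for i in range(len(words) - length + 1)}
            support.update(grams)
        return {g: c for g, c in support.items() if c >= min_support}

    videos = [[1, 4, 4, 7, 2, 9], [3, 4, 4, 7, 5], [4, 4, 7, 2, 9, 8]]
    print(frequent_sequences(videos))
    # e.g. {(4, 4, 7): 3, (4, 7, 2): 2, (7, 2, 9): 2} (ordering may vary)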

The ideas proposed in the thesis have been implemented and validated with extensive experiments. We demonstrate the accuracy, efficiency and scalability of all three approaches on large, standard datasets such as PASCAL VOC, TRECVID and MUSCLE-VCD, as well as on movie and news datasets.

 

Year of completion:  2010
 Advisor : C. V. Jawahar

Related Publications

  • Mihir Jain, Sreekanth Vempati, Chandrika Pulla and C. V. Jawahar - Example Based Video Filters, Proceedings of the 8th International Conference on Image and Video Retrieval (CIVR 2009), July 8-10, 2009, Santorini, Greece. [PDF]

  • Sreekanth Vempati, Mihir Jain, Omkar Parkhi, C. V. Jawahar, Andrea Vedaldi, Marcin Marszalek and Andrew Zisserman - In Proceedings of the TREC Video Retrieval (TRECVID) Workshop organized by NIST, Gaithersburg, USA, 2009.
  • James Philbin, Manuel Marin-Jimenez, Siddharth Srinivasan, Andrew Zisserman, Mihir Jain, Sreekanth Vempati, Pramod Sankar and C. V. Jawahar - In Proceedings of the TREC Video Retrieval (TRECVID) Workshop organized by NIST, Gaithersburg, USA, 2008.

Downloads

thesis

ppt

Exploiting the Graphics Hardware to solve two compute intensive problems: Singular Value Decomposition and Ray Tracing Parametric Patches


Sheetal Lahabar (homepage)

The rapid increase in the performance of graphics hardware has made the GPU a strong candidate for many compute-intensive, data-parallel tasks. Off-the-shelf GPUs are rated at upwards of 2 TFLOPs of peak performance today. The high compute power of modern Graphics Processing Units (GPUs) has been exploited in a wide variety of domains other than graphics, and they have proven to be powerful computing workhorses in many fields. We leverage the processing power of GPUs to solve two highly compute-intensive problems: Singular Value Decomposition and ray tracing of parametric patches. Singular Value Decomposition (SVD) is an important matrix factorization technique used to decompose a matrix into several component matrices; it is used in a wide variety of applications in signal processing, scientific computing and related areas. However, little work has been done to implement it on parallel architectures. Direct ray tracing of parametric patches provides high geometric precision, resulting in fewer artifacts and better rendering, since true normals are available for lighting at every pixel. Researchers have developed algorithms to ray trace parametric surfaces directly, but these algorithms are either computationally expensive or have other disadvantages. Both problems are computationally demanding and amenable to massively parallel implementations. Singular Value Decomposition requires factorizing a dense matrix into component matrices, and finding the intersection of a ray with a Bézier patch requires finding the roots of an 18-degree polynomial; both involve an enormous number of floating-point operations. The advent of GPUs with support for higher precision, and their ever-growing parallel horsepower, makes them a tempting resource for such numerical computations. We propose parallel implementations of these problems on the GPU that significantly outperform the best available CPU implementations.

We implement the Singular Value Decomposition of a dense matrix on the GPU using the two-step Golub-Reinsch algorithm. In the first step, bidiagonalization reduces the given dense matrix to a bidiagonal matrix together with component orthogonal matrices using a series of Householder transformations. In the second step, diagonalization is performed by the implicitly shifted QR algorithm, which takes O(n) iterations and decomposes the bidiagonal matrix into a diagonal matrix and orthogonal matrices. Each Householder transformation reduces one row-column pair. We divide the required Householder transformations into n/L blocks; L transformations are applied in parallel by mapping them to CUBLAS operations to derive maximum performance, thereby exploiting the underlying hardware fully. We use a hybrid implementation for the diagonalization step that splits the computation between the CPU and the GPU. Every iteration of the QR algorithm applies Givens rotations to the elements of the bidiagonal matrix, and the corresponding inverse rotations are applied to the rows of the orthogonal matrices, which are initially identity. The rotations on the bidiagonal elements depend on one another and are applied on the CPU, whereas applying the inverse rotations to the rows of the orthogonal matrices is independent across rows and is performed in parallel on the GPU. We combine the power and functionality of CUDA with the high-performance software libraries available with it to exploit GPU parallelism and achieve high computing performance; this approach can be used for other graphics and non-graphics tasks as well. Our complete Singular Value Decomposition implementation significantly outperforms the MATLAB and Intel® Math Kernel Library (MKL) LAPACK implementations on the CPU. We show a speedup of up to 60 over the MATLAB implementation and up to 8 over the Intel MKL implementation on an Intel Dual Core 2.66 GHz PC with an NVIDIA GTX 280. We are able to compute the Singular Value Decomposition of very large matrices, up to 14K in size, which is otherwise impossible on the CPU due to memory limitations. The GPUs were limited to single-precision numbers, though that is changing with newer generations; the error due to the lower precision was less than 0.001%.
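In symbols (standard Golub-Reinsch notation, not quoted from the thesis), the two steps compose as

    A = Q B P^T,        B = X \Sigma Y^T,

so that A = (Q X) \Sigma (P Y)^T = U \Sigma V^T, with Q, P, X, Y orthogonal, B bidiagonal and \Sigma diagonal.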

We present correct ray tracing of parametric bicubic surfaces on the GPU using Kajiya's approach, without subdividing the patches. We use a BVH over the patches to cull non-intersecting rays. The process starts with a list of rays to be traced in each frame; the BVH is traversed for each ray to enumerate the potential ray-patch intersections. For each potential intersection, a univariate 18-degree polynomial is formed and its roots are found using Laguerre's method; the real roots are the parameter values at the intersection points. Polynomial formation and root finding for each ray-patch intersection can be performed independently, making them ideal for processing on the GPU. Our algorithm treats all bounces in a modular fashion: rays for a subsequent bounce are generated at the end of each bounce, and the process is repeated. It finds the exact points of intersection and provides exact lighting using per-pixel normals. We perform the BVH traversal, finding the v-root using Laguerre's method, finding the u-root using a GCD computation, generating rays for subsequent bounces, and shading, all in parallel on the GPU. The bulk of the time (as much as 82%) is spent on polynomial root finding, which requires slow double-precision computation. Direct ray tracing facilitates shading at each pixel using the true normals. The ray-tracing time depends on the number of ray-patch intersections evaluated; each intersection takes around 3.7 microseconds on a GTX 280. The independence of these computations makes the method highly amenable to multi-GPU processing and faster on future hardware. The time taken is about 466 milliseconds for primary rays on the Killeroo model, with similar times for subsequent bounces, on a GTX 280. We achieve a speedup of 340x on an NVIDIA GTX 280 and 990x on an NVIDIA GTX 480 over a MATLAB implementation of Kajiya's algorithm on an AMD Dual Core PC. The limiting factors of performance are slow double-precision arithmetic and the small amount of shared memory on today's GPUs; the next generation of GPUs improves both significantly, and we expect to see interactive rates on them. Our algorithm exploits parallelism by dividing the computation into independent tasks.
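In Kajiya's formulation (sketched here in standard notation, not quoted from the thesis), the ray is written as the intersection of two planes (p_1, d_1) and (p_2, d_2), and substituting the bicubic patch S(u, v) into the two plane equations gives two bicubic constraints

    f(u,v) = p_1 \cdot S(u,v) + d_1 = 0,        g(u,v) = p_2 \cdot S(u,v) + d_2 = 0.

Eliminating u via the Bezout resultant Res_u(f, g) yields the degree-18 polynomial in v mentioned above; its real roots in [0, 1] are refined with Laguerre's method, the matching u follows from a GCD computation, and the ray parameter t from back-substitution.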

 

Year of completion:  2010
 Advisor : P. J. Narayanan

Related Publications

  • Sheetal Lahabar and P. J. Narayanan - Singular Value Decomposition on GPU using CUDA, Proceedings of the IEEE International Parallel & Distributed Processing Symposium (IPDPS 09), 25-29 May 2009, Rome, Italy. [PDF]


Downloads

thesis

ppt
