Parallel Computing using CPU and GPU


Commodity graphics hardware has become a cost-effective parallel platform to solve many general problems; owing to its low cost to performance ratio (under $0.5 per GFLOP). GPUs are optimized for graphics operations and their programming model is highly restrictive. All algorithms are disguised as graphics rendering passes with the programmable shaders interpreting the data. This was the situation until the latest model of GPUs following the Shader Model 4.0 were released late in 2006. These GPUs follow a unified architecture and can be used in more flexible ways than their predecessors. Starting from the G80 series of GPUs, Nvidia offers an alternate programming model called Compute Unified Device Architecture (CUDA) to the underlying parallel processor. CUDA is highly suited for general purpose programming on the GPUs and provides a model close to the PRAM model. There has been a tremendous amount of growth in both the programmability and efficiency of GPUs over the years, leading to many GPU based solutions for general purpose problems.

Projects involving GPU's and CUDA

1. String Sort on GPUs StringSort

String sorting or variable-length key sorting has lagged in performance on the GPU even as the fixed-length key sorting has improved dramatically. Radix sorting is the fastest on the GPUs. In this work, we present a fast and efficient string sort on the GPU that is built on the available radix sort. Over 70% of the string sort time is spent on standard Thrust primitives for sort, scatter and scan. This provides high performance along with high adaptability to future GPUs. Our string sort algorithm is 10x faster compared to current GPU methods.

2. Mixed-Resolution Patch Matching

PatchMatchingMatching patches of a source image with patches of itself or a target image is a first step for many operations. Finding the optimum nearest-neighbors of each patch using a global search of the image is expensive. Optimality is often sacrificed for speed as a result. In this work, we developed the Mixed-Resolution Patch-Matching (MRPM) algorithm that uses a pyramid representation to perform fast global search. We compare mixed-resolution patches at coarser pyramid levels to alleviate the effects of smoothing. We store more matches at coarser resolutions to ensure wider search ranges and better accuracy at finer levels. Our method achieves near optimality in terms of exhaustive search. Our simple approach enables fast parallel implementations on the GPU, yielding upto 70x speedup compared to other iterative approaches.

3. Scalable K-Means Clustering

ScalableClusteringK-Means is a popular clustering algorithm with wide applications in Computer Vision, Data mining, Data Visualization, etc. Clustering large numbers of high-dimensional vectors is very computation intensive. In this work, we present the design and implementation of the K-Means clustering algorithm on the modern GPU. A load balanced multi-node, multi-GPU implementation which can handle up to 6 million, 128-dimensional vectors was also developed. Our implementation scales linearly or near-linearly with different problem parameters. We achieve up to 2 times increase in speed compared to the best GPU implementation for K-Means on a single GPU.

4. Error Diffusion Dithering

FsdHybridMany image filtering operations provide ample parallelism, but progressive non-linear processing of images is among the hardest to parallelize due to long, sequential, and non-linear data dependency. A typical example of such an operation is error diffusion dithering, exemplified by the Floyd-Steinberg algorithm. In this work, we present its parallelization on multicore CPUs using a block-based approach and on the GPU using a pixel-based approach. We also develop a hybrid approach in which the CPU and the GPU operate in parallel during the computation. Our implementation can dither an 8K x 8K image on an off-the-shelf laptop with Nvidia 8600M GPU in about 400 milliseconds when the sequential implementation on its CPU took about 4 seconds.

5. Graph Algorithms on CUDA

GraphCutsWe have developed several basic graph algorithms on the CUDA architecture including BFS, Single Source Shortest Path(SSSP), All-Pairs Shortest Path(APSP), and Minimum Spanning Tree computation for large graphs consisting of millions of vertices and edges. We show results on random, scale free and almost linear graphs. Our approaches are 10-50 times faster than their CPU counterparts, on random graphs with an average degree of 6 per vertex.

We have also developed efficient parallel implementations for performing Graph-Cuts on grid graphs in CUDA. Our original work, titled CUDA-Cuts is based on the push-relabel operation for computing the max-flow and hence the min-cut of a graph. We have also developed a multi-resolution framework for max-flow computation which depends on shrink-expand steps in order to solve the graph fully at lower resolution levels, and uses these results to initialize higher resolution graphs closer to convergence. This improves the overall convergence time for the operation.

Related Publications

  • Aditya Deshpande and P. J. Narayanan - Can GPUs Sort Strings Efficiently ? Proceedings of the IEEE Conference on High Performance Computing, 18-21 Dec. 2013, Bangalore, India. [PDF]

  • Harshit Surekha and P J NarayananMixed-Resolution Patch-Matching Proceedings of 12th European Conference on Computer Vision, 7-13 Oct. 2012, Vol. ECCV 2012, Part-VI, LNCS 7577, pp. 187-198, Firenze, Italy. [PDF]

  • Parikshit Sakurikar and P J Narayanan - Fast Graph Cuts using Shrink-Expand Reparameterization Proceedings of IEEE Workshop on Applications of Computer Vision 9-11 Jan. 2012, ISSN 1550-5790 E-ISBN 978-1-4673-0232-6, Print ISBN 978-1-4673-0233-3, pp. 65-71, Breckenridge, CO, USA. [PDF]

  • Aditya Deshpande, Ishan Misra and P J Narayanan - Hybrid Implementation of Error Diffusion Dithering Proceedings of 18th International Conference on High Performance Computing 18-21 Dec. 2011, E-ISBN 978-1-4577-1949-3, Print ISBN 978-1-4577-1951-6, pp. 1-10, Bangalore, India. [PDF]

  • K. Wasif Mohiuddin and P.J. Narayanan - Scalable Clustering Using Multiple GPUs Proceedings of International Conference on High Performance Computing 18-21 Dec. 2011, E-ISBN 978-1-4577-1949-3, Print ISBN 978-1-4577-1951-6, Bangalore, India. [PDF]

  • Vibhav Vineet and P. J. Narayanan - CUDA Cuts: Fast Graph Cuts on the GPU Proceedings of CVPR Workshop on Visual Computer Vision on GPUs, 23-28th june, Anchorage, Alaska, USA. IEEE Computer Society 2008 [PDF]

  • Pawan Harish and P.J. Narayanan - Accelerating Large Graph Algorithms on the GPU using CUDA Proc of IEEE International Conference on High Performance Computing (HiPC 2007) Goa, December, 2007. [PDF]

Associated People

  • Aditya Deshpande
  • K. Wasif Mohiuddin
  • Vibhav Vineet
  • Parikshit Sakurikar
  • Ishan Misra
  • Harshit Sureka

Security and Privacy of Visual Data

With a rapid development and acceptablity of computer vision based systems in one's daily life, securing of the visual data has become imperative. Security issues in computer vision primarily originates from the storage, distribution and processing of the personal data, whereas privacy concerns with tracking down of the user's activity. The current methods of securing an online protocol is to apply a cryptographic layer on top of an existing processing modules, thus securing the data against unauthorised third party access. However, this is often not enough to ensure the complete security of the user's privileged information. Through this work we address specific security and privacy concerns of the visual data. We propose application specific, computationally efficient and provably secure computer vision algorithms for the encrypted domain. More specifically we address the following issues:

  • Efficacy: Security should not be at the cost of accuracy.
  • Efficiency: Encryption/Decryption is computationaly expensive. Secure algorithms should be practical.
  • Domain Knowledge: Domain specific algorithms will be more efficient than generic solutions such as SMC.
  • Security: Algorithms need to be provably-secure and meet futuristic requirements.


Private Content Based Image Retrieval

PCBIR IconFor content level access, very often database needs the query as a sample image. However, the image may contain private information and hence the user does not wish to reveal the image to the database. Private Content Based Image Retrieval (PCBIR) deals with retrieving similar images from an image database without revealing the content of the query image. not even to the database server. We propose algorithms for PCBIR, when the database is indexed using hierarchical index structure or hash based indexing scheme. Experiments are conducted on real datasets with popular features and state of the art data structures. It is observed that specialty and subjectivity of image retrieval (unlike SQL queries to a relational database) enables in computationally efficient yet private solutions.

[Project Homepage]

Blind Authentication: A Crypto-Biometric Verification ProtocolBA Icon

Biometric authentication provides a secure, non-repudiable and convenient method for identity verification. Hence they are ideal to be deployed in both high security as well as remote authentication applications. However, the assertions on security and non-repudiation are valid only if the integrity of the overall system is maintained. A hacker who gains physical or remote access to the system can read or modify the stored templates and successfully pose as or deny access to legitimate users. We propose a secure biometric authentication protocol over public networks using asymmetric encryption, which captures the advantages of biometric authentication as well as the security of public key cryptography. Blind Authentication provides non-repudiable identity verification, while not revealing any additional information about the user to the server or vice versa.

Privacy Preserving Video Surveillance


Widespread use of surveillance cameras in offices and other business establishments, pose a significant threat to the privacy of the employees and visitors. The challenge of introducing privacy and security in such a practical surveillance system has been stifled by the enormous computational and communication overhead required by the solutions. In this work, we propose to utilize some of the inherent properties of the image data to enable efficient and provably secure surveillance algorithms. Our method enables distributed secure processing and storage, while retaining the ability to reconstruct the original data in case of a legal requirement. Our proposed paradigm is highly secure and extreamly fast over the traditional SMC, making privacy preserving surveillance practical.

[Project Homepage]


Fast and Secure Video Encryption

In the recent years, there has been tremendous growth in the areas like networking, digital multimedia, etc., which made multimedia distribution much simpler, for many fascinating applications. So, Businesses and other organizations are now able to perform real-time audio and video conferencing, even over a non-dedicated channel. An eavesdropper can conveniently intercept and capture the sensitive and valuable multimedia content travelling in a public channel. Hence, multimedia security is needed for commerce. This work is focused on proposing new techniques for the security of the video data especially for real-time applications. The major challenges in developing an ideal video encryption algorithm are providing good security against different types of security attacks, no overhead on the MPEG compression process and less encryption time in order to support real-time transfer of the videos. Brief project details.

Related Publications

  • Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C. V. Jawahar - Efficient Biometric Verification in Encrypted Domain Proceedings of the 3rd International Conference on Biometrics (ICB 2009), pp. 899-908, June . 2-5, 2009, Alghero, Italy. [PDF]

  • Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar - Efficient Privacy Preserving Video Surveillance Poceedings of the 12th International Conference on Computer Vision (ICCV), 2009, Kyoto, Japan [PDF]

  • C. Narsimha Raju, Gangula Umadevi, Kannan Srinathan and C. V. Jawahar - Fast and Secure Real-Time Video Encryption IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 257-264, 16-19 Dec,2008, Bhubaneswar, India. [PDF]

  • C. Narsimha Raju, UmaDevi Ganugula, Srinathan Kannan and C.V. Jawahar - A Novel Video Encryption Technique Based on Secret Sharing Proc. of IEEE International Conference on Image Processing(ICIP),Oct 12-15, 2008,San Diego, USA. [PDF]

  • Shashank J, Kowshik P, Kannan Srinathan and C.V. Jawahar - Private Content Based Image Retrieval Proceedings of IEEE computer society conference on Computer Vision and Pattern Recognition (CVPR) 2008, Egan Convention Center, Anchorage, Alaska, June 24-26, 2008. [PDF]

  • C. Narsimha Raju, Kannan Srinathan and C. V. Jawahar - A Real-Time Video Encryption Exploiting the Distribution of the DCT coefficients IEEE TENCON, November 18-21,2008, Hyderabad, India. [PDF]

 Associated People

  • Dr. Kannan Srinathan

Learning Appearance Models


Our reseach focuses on learning appearance models from images/videos that can be used for a variety of tasks such as recognition, detection and classification etc. Prior information such as geometry and kinematics is used to improve the quality of appearance models learnt thus enabling better performance at these tasks.

dynamic activities

Dynamic Activity Recognition

Many of the human activities such as Jumping, Squatting have a correlated spatiotemporal structure. They are composed of homogeneous units. These units, which we refer to as actions, are often common to more than one activity. Therefore, it is essential to have a representation which can capture these activities effectively. To develop this, we model the frames of activities as a mixture model of actions and employ a probabilistic approach to learn their low-dimensional representation. We present recognition results on seven activities performed by various individuals. The results demonstrate the versatility and the ability of the model to capture the ensemble of human activities.

eigen spaces

Boosting Appearance Models using Geometry

We developed novel method to construct an eigen space representation from limited number of views, which is equivalent to the one typically obtained from large number of images. This procedure implicitly incorporates a novel view synthesis algorithm in the eigen space construction process. Inherent information in an appearance representation is enhanced using geometric computations. We experimentally verify the performance for orthographic, affine and projective camera models. Recognition results on the COIL and SOIL image database are promising.

Face Video Manipulation using Tensorial Fatorization

expression transfer

We use Tensor Factorization for manipulating videos of human faces. Decomposition of a video represented as a tensor into non-negative rank-1 factors results in sparse and separable factors equivalent to a local parts decomposition of the object in the video. Such a decomposition can be used for tasks like expression transfer and face morphing. For instance, given a facial expression video it can be represented as a tensor which can then be factorized. The factors that best represent the expression can be identied which can then be transfered to another face video thus transferring the expression. A good solution to the problem of expression transfer would require explicit modeling of the expression and its interaction with the underlying face content. Instead the method proposed here is purely appearance based and the results demonstrate that the proposed method is a simple alternative to the popular complex solution.

Related Publication

  • S. Manikandan, Ranjeeth Kumar and C.V. Jawahar - Tensorial Factorization Methods for Manipulation of Face Videos, The 3rd International Conference on Visual Information Engineering 26-28 September 2006 in Bangalore, India. [PDF]

  • Ranjeeth Kumar, S. Manikandan and C. V. Jawahar - Task Specific Factors for Video Characterization, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338 pp.376-387, 2006. [PDF]

  • Paresh K. Jain, Kartik Rao P. and C. V. Jawahar - Computing Eigen Space from Limited Number of Views for Recognition, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338 pp.662-673, 2006. [PDF]

  • S. S. Ravi Kiran, Karteek Alahari and C. V. Jawahar, Recognizing Human Activities from Constituent Actions, Proceedings of the National Conference on Communications (NCC), Jan. 2005, Kharagpur, India, pp. 351-355. [PDF]



Associated People

Projected Texture for 3D Object Recognition

Introductiondeformation geometry

Three dimensional object are characterized by their shape, which can be thought of as the variation in depth over the object, from a particular view point. These variations could be deterministic as in the case of rigid objects or stochastic for surfaces containing a 3D texture. These depth variations are lost during the process of imaging and what remains is the intensity variations that are induced by the shape and lighting, as well as focus variations. Algorithms that utilize 3D shape for classification tries to recover the lost 3D information from the intensity or focus variations or using additional cues from multiple images, structured lighting, etc. This process is computationally intensive and error prone. Once the depth information is estimated, one needs to characterize the object using shape descriptors for the purpose of classification.

Image-based classification algorithms tries to characterize the intensity variations of the image of the object for recognition. As we noted, the intensity variations are affected by the illumination and pose of the object. The attempt of such algorithms is to derive descriptors that are invariant to the changes in lighting and pose. Although image based classification algorithms are more efficient and robust, their classification power is limited as the 3D information is lost during the imaging process.

pattern shift1We propose the use of structured lighting patterns, which we refer to as projected texture, for the purpose of object recognition. The depth variations of the object induces deformations in the projected texture, and these deformations encode the shape information. The primary idea is to view the deformation pattern as a characteristic property of the object and use it directly for classification instead of trying to recover the shape explicitly. To achieve this we need to use an appropriate projection pattern and derive features that sufficiently characterize the deformations. The patterns required could be quite different depending on the nature of object shape and its variation across objects.





3D Texture Classification

A feature "Normalized Histogram of Derivative of Gradients (NHoGD) is proposed to capture deformation statistic for parallel projection patterns.

Gradient directions in images are the directions of maximal intensity variation. In our scenario, the gradient directions can indicate the direction of the projected lines. As the lines get deformed with surface height variations, we compute the differential of the gradient directions in both x and y axes to measure the rate at which the surface height varies. The derivatives of gradients are computed at each pixel in the image, and the texture is characterized by a Histogram of the Derivatives of Gradients (HoDG). The gradient derivative histogram is a good indicator of the nature of surface undulations in a 3D texture. For classification, we treat the histogram as a feature vector to compare two 3D textures. As the distance computation involves comparing corresponding bins from different images, we normalize the counts in each bin of the histogram across all the samples in the training set. This normalization allows us to treat the distance between corresponding bins between histograms, equally, and hence employ the Euclidean distance for comparison of histograms. The Normalized histograms, or NHoDG is a simple but extremely effective feature for discriminating between different texture classes. Figure on right illustrates the computation of the NHoDG feature from a simple image with bell shaped intensity variation.




imgCategory Recognition for Rigid Objects

The primary concerns in developing a representation for object category is that the description should be invariant to both shape and pose of the object. Note that the use of projected patterns allows us to avoid object texture, and concentrate only on its shape. Approaches such as ’bag of words’ computed from interest points have been successfully employed for image based object category recognition.

Our approach is similar in spirit to achieve pose invariance. We learn the class of local deformations that are possible for each category of objects by creating a codebook of such deformations from a training set. Each object is then represented as a histogram of local deformations based on the codebook. Figure on left illustrates the computation of the feature vector from a scene with projected texture. There are two primary concerns to be addressed while developing a parts based shape representation: The location of points from which the local shape descriptor is computed is important to achieve position invariance. In image based algorithms, the patches are localized by using an interest operator that is computed from object texture or edges. However, in our case the primary objective is to avoid using texture information and concentrate on the shape information provided by the projected texture. Hence we choose to use a set of overlapping windows that covers the whole scene for computation of local deformations. Our representation based on the codebook allows us to concentrate on the object deformation for recognition. The description of the local deformations should be sufficient to distinguish between various local surface shapes within the class of objects. The feature vector used exploits the periodic nature of projected patterns. Since Fourier representation is an effective descriptor for periodic signals, and since we are interested in the nature of deformation and not its exact location, we compute magnitude or the absolute value of the Fourier coefficients (AFC) corresponding to each of the window patch as our feature vector. To make comparisons in a Euclidean space for effective, we use a logarithmic representation of these coefficients (LAFC). We show that this simple Fourier magnitude based representation of the patches can effectively achieve the discriminative power that we seek. The feature extraction process proceeds as follows: The images in the training set are divided into a set of overlapping windows of size 20×20 (decided experimentally). Each window is then represented using the magnitude of Fourier representation in logarithmic scale. This results in a 200 dimensional feature vector (due to symmetry of Fourier representation) for each window. A K-means clustering of windows in this feature space allows us to identify the dominant pattern deformations, which forms a codebook


Recognition of Aligned Deterministic shapes

We have taken the example of hand geometry based person authentication for demonstrating our approach. We have collected dataset of 181 user with peg based alignment. We divided the hand image into a set of non-overlapping sub-windows, and compute the local textural characteristics of each window using a filter bank of 24 Gabor filters with 8 orientations and 3 scales (or frequencies).

Related Publications

  • Avinash Sharma, Nishant Shobhit and Anoop M. Namboodiri - Projected Texture for Hand Geometry based Authentication Proceedings of CVPR Workshop on Biometrics, 28 June, Anchorage, Alaska, USA. IEEE Computer Society 2008. [PDF]


  • Avinash Sharma, Anoop Namboodiri - Projected Texture for classification of 3D Texture Surface, Submitted to ECCV 2008 (Results awaited)
  • Avinash Sharma, Anoop Namboodiri - Object Category Recognition with Projected Texture, Submitted to ICPR 2008 (Results awaited)

  • Avinash Sharma - A Technical Report on Projected Texture for Object Recognition

Associated People

Biological Vision


The perceptual mechanisms used by different organisms to negotiate the visual world are fascinatingly diverse. Even if we consider only the sensory organs of vertebrates, such as eye, there is much variety. Several disciplines have approached the problem of investigating how sensory, motors and central visual systems function and are oganised. Area of biological vision aims to build a computational understanding of various brian mechanisms. Synergy between biological and computer vision research can be found in low-level vision. Substantial insights about the processes for extracting colour, edge, motion and spatial frequency information from images have come from combining computational and neuro-physiological constraints. Understanding of human perception/vision is said to be an early step towards indetifying objects and understanding of scene.

Work Undertaken


:: Towards Understanding Texture Processing :: 

A fundamental goal of texture research is to develop automated computational methods for retrieving visual information and understanding image content based on textural properties in images. A synergy between biological and computer vision research in low-level vision can give substantial insights about the processes for extracting color, edge, motion, and spatial frequency information from images. In this thesis, we seek to understand the texture processing that takes place in low level human vision in order to develop new and effective methods for texture analysis in computer vision. The different representations formed by the early stages of HVS and visual computations carried out by them to handle various texture patterns is of interest. Such information is needed to identify the mechanisms that can be use in texture analysis tasks. (more detail...)

:: Biologically Inspired Interest Point Operator ::
poich1Interest point operators (IPO) are used extensively for reducing computational time and improving the accuracy of several complex vision tasks such as object recognition or scene analysis. SURF, SIFT, Harris,Corner points etc., are popular examples. Though there exists a large number of IPOs in the vision literature, most of them rely on low level features such as color,edge orientation etc., making them sensitive to degradation in the images.

Human vision systems (HVS) perform these tasks with seemingly little effort and are robust to such degradation by employing spatial attention mechanisms to reduce the computational burden. Extensive studies of these spatial attention mechanisms have led to several computational models (eg. Itti, Koch). However, very few models have found successful applications in computer vision related tasks partly owing to their prohibitive

Computational attention systems either have used top-down or bottom-up information. Using both types of information is an attractive choice for top-down knowledge is quite helpful particularly when images are degraded [Antonio Torralba]. Our work is focused on developing a robust biologically-inspired IPO capable of utilizing top-down knowledge. The operator will be tested as a feature detector /descriptor for a Monocular Visual SLAM.

Antonio Torralba, Contextual Priming for Object Detection, IJCV,Vol.53, 2,2003,Pages:169--191.
Laurent Itti, Christof Koch, A saliency based search mechanism for overt and covert shifts of visual attention,Vision Research,Vol.40,2000,Pages: 1489--1506

Current under-going Projects

  • Medical Image Reconstruction on Hexagonal Grid
  • Computational Understanding of Medical Image Interpretation by Expert

Related Publications

  • N.V. Kartheek Medathati, Jayanthi Sivaswamy - Local Descriptor based on Texture of Projections Proceedings of Seventh Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP'10),12-15 Dec. 2010,Chennai, India. [PDF]

  • Joshi Datt Joshi, Saurabh Garg and Jayanthi Sivaswamy - Script Identification from Indian Documents, Proceedings of IAPR Workshop on Document Analysis Systems (DAS 2006), Nelson, pp.255-267. [PDF]

  • Gopal Datt Joshi, Saurabh Garg and Jayanthi Sivaswamy - A Generalised Framework for Script Identification Proc. of International Journal for Document Analysis and Recognition(IJDAR), 10(2), pp.55-68, 2007. [PDF]

  • Gopal Datt Joshi and Jayanthi Sivaswamy - A Computational Model for Boundary Detection, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338 pp.172-183, 2006. [PDF]

  • Gopal Datt Joshi, and Jayanthi Sivaswamy - A Simple Scheme for Contour Detection, Proceedings of International Conference on Computer Vision and Applications (VISAP 2006), Setubal. [PDF]

  • L.Middleton and J. Sivaswamy, Hexagonal Image Processing, Springer Verlag, London, 2005, ISBN: 1-85233-914-4. [PDF]

  • Gopal Datt Joshi , and Jayanthi Sivaswamy - A Multiscale Approach to Contour Detection, Proceedings of International Conference on Cognition and Recognition ,pp. 183-193, Mysore, 2005. [PDF]

  • L. Middleton and J. Sivaswamy - A Framework for Practical Hexagonal-Image Processing, Journal of Electronic Imaging, Vol. 11, No. 1, January 2002, pp. 104--114. [PDF]


Associated People