
Secure Biometric Authentication with Fixed-Length Binary Representations


Rohan Kulkarni (homepage)

Biometrics have been established to be extremely reliable for identifying individuals and are thus at the core of several real-world systems, ranging from employee attendance to access-control systems in the military. With growing computing resources available to individuals, biometric authentication systems are being deployed in an even wider range of commercial applications. The permanent nature of biometrics raises serious security concerns with these deployments: losing one's biometric trait can compromise an individual's identity in every system they are enrolled in. Biometric traits are also non-rigid in nature, requiring a fuzzy matching process, which makes it difficult to directly borrow popular security techniques used elsewhere with passwords and key-cards. Research in this field therefore attempts to develop efficient and reliable biometric authentication systems while addressing these issues of security and privacy.

Binary biometric representations have been shown to provide significant improvements in efficiency without compromising system performance for various modalities, including fingerprints, palmprints and iris. Hence, this thesis focuses on developing secure and privacy-preserving protocols for fixed-length binary biometric templates that use Hamming distance as the dissimilarity measure. We propose a novel authentication protocol using a somewhat homomorphic encryption scheme that provides template protection and the ability to use masks while computing the Hamming distance. The protocol operates on encrypted data, providing complete biometric privacy to individuals trying to authenticate and revealing only the final matching score to the server. It allows real-time authentication and retains the matching accuracy of the underlying representation, as demonstrated by our experiments on iris and palmprints.
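The masked Hamming distance at the heart of such protocols is easy to state in the clear. Below is a minimal NumPy sketch on plaintext toy templates; the thesis performs this computation under homomorphic encryption, which this illustration omits:

```python
import numpy as np

def masked_hamming(a, b, mask_a, mask_b):
    """Fractional Hamming distance over bits valid in both templates
    (e.g. iris bits not occluded by eyelids or eyelashes)."""
    valid = mask_a & mask_b
    n_valid = int(valid.sum())
    if n_valid == 0:
        return 1.0           # nothing comparable: treat as maximal distance
    return np.count_nonzero((a ^ b) & valid) / n_valid

# toy 8-bit templates and a shared validity mask
a = np.array([1, 0, 1, 1, 0, 0, 1, 0])
b = np.array([1, 1, 1, 0, 0, 0, 1, 1])
m = np.array([1, 1, 1, 1, 1, 1, 0, 0])   # last two bits occluded
d = masked_hamming(a, b, m, m)            # differs at 2 of the 6 valid bits
```

Normalising by the number of mutually valid bits, rather than the template length, is what lets the measure ignore occluded regions.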

We also propose a one-time biometric token based authentication protocol for widely used banking transactions. In the current scenario, the user is forced to trust the service provider with his sole banking credentials or credit card details to avail the desired services. Commonly used one-time-password systems do provide additional transaction security; however, organizations using such systems are still incapable of differentiating between a genuine user trying to authenticate and an adversary with stolen credentials. Involving biometric security would certainly strengthen the authentication process. The proposed protocol upholds the requirements of secure authentication, template protection and revocability, while providing user anonymity from the service provider. We demonstrate our system's security and performance using iris biometrics to authenticate individuals.

 

Year of completion: December 2014
Advisor : Anoop M. Namboodiri

Related Publications

  • Rohan Kulkarni, Anoop M. Namboodiri - One-Time Biometric Token based Authentication, Proceedings of the Ninth Indian Conference on Computer Vision, Graphics and Image Processing, 14-17 Dec 2014, Bangalore, India. [PDF]

  • Rohan Kulkarni and Anoop M. Namboodiri - Secure Hamming Distance based Biometric Authentication, Proceedings of the 6th IAPR International Conference on Biometrics, 04-07 June 2013, Madrid, Spain. [PDF]


Downloads

thesis

 ppt

Word Recognition of Indic Scripts


Naveen TS (homepage)

For most Indian scripts, Optical Character Recognition (OCR) problems are often formulated as an isolated character (symbol) classification task followed by a post-classification stage (containing modules like UNICODE generation, error correction, etc.) to generate the textual representation. Such approaches are prone to failures due to (i) the difficulty of designing a reliable word-to-symbol segmentation module that works robustly in the presence of degraded (cut/fused) images, and (ii) converting the outputs of the classifiers into a valid sequence of UNICODEs. In this work, we look at two important aspects of word recognition: word-image to text-string conversion, and error detection and correction in words represented as UNICODEs. In this thesis, we propose a formulation in which the expectations on the two critical modules of a traditional OCR (i.e., segmentation and isolated character recognition) are minimized, and the harder recognition task is modelled as learning an appropriate sequence-to-sequence transcription scheme. We thus formulate recognition as a direct transcription problem. Given many examples of feature sequences and their corresponding UNICODE representations, our objective is to learn a mapping that can convert a word directly into a UNICODE sequence. This formulation has multiple practical advantages: (i) it significantly reduces the number of classes for Indian scripts, (ii) it removes the need for word-to-symbol segmentation, (iii) it does not require strong annotation of symbols to design the classifiers, and (iv) it directly generates a valid sequence of UNICODEs. We test our method on more than 5000 pages of printed documents in multiple languages. We design a script-independent, segmentation-free architecture which works well for 7 Indian scripts. Our method is compared against other state-of-the-art OCR systems and evaluated on a large corpus.
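As one illustration of transcription-style decoding, the sketch below collapses per-frame label posteriors into a UNICODE string in the greedy CTC manner: take the best label per frame, merge repeats, drop blanks. This is a generic approximation of how sequence learners of this kind emit text, not the thesis's exact architecture; the toy posteriors and label set are invented for the example:

```python
import numpy as np

BLANK = 0  # CTC blank label

def greedy_ctc_decode(frame_probs, id_to_char):
    """Collapse per-frame predictions into a label sequence:
    argmax at each frame, merge consecutive repeats, drop blanks."""
    best = frame_probs.argmax(axis=1)
    out, prev = [], BLANK
    for label in best:
        if label != prev and label != BLANK:
            out.append(id_to_char[label])
        prev = label
    return "".join(out)

# toy posteriors over {blank, क, म, ल} for 7 frames
probs = np.array([
    [0.1, 0.8, 0.05, 0.05],   # क
    [0.1, 0.8, 0.05, 0.05],   # क (repeat, merged)
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.05, 0.8, 0.05],   # म
    [0.9, 0.05, 0.03, 0.02],  # blank
    [0.1, 0.05, 0.05, 0.8],   # ल
    [0.9, 0.05, 0.03, 0.02],  # blank
])
word = greedy_ctc_decode(probs, {1: "क", 2: "म", 3: "ल"})  # collapses to "कमल"
```

The key point mirrored from the text is that no symbol segmentation is needed: the decoder maps a whole feature sequence directly to a valid UNICODE string.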

The second contribution of this thesis is an investigation into the possibility of error detection and correction in highly inflectional languages, taking Malayalam and Telugu as examples. Error detection in OCR output using dictionaries and statistical language models (SLMs) has been common practice for some time in the design of post-processors, and multiple strategies have been used successfully for English. However, this has not yet translated into improved error detection performance for many inflectional languages, especially Indian languages. Challenges such as very large unique-word lists and the lack of linguistic resources and reliable language models are some of the reasons for this. In this thesis, we investigate the major challenges in developing error detection techniques for highly inflectional Indian languages. We compare and contrast several attributes of English with inflectional languages such as Telugu and Malayalam, make observations by analysing statistics computed from popular corpora, and relate these observations to the error detection schemes. We propose a method that learns from error patterns and SLMs, and detects errors in Telugu and Malayalam with an F-score comparable to that of less inflectional languages like Hindi.
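A common way to realize SLM-style error detection is to flag words containing character n-grams never seen in a corpus, which sidesteps the huge unique-word lists of inflectional languages. The sketch below shows the idea on a toy English corpus; it is illustrative only, and the thesis's actual method additionally learns from error patterns:

```python
from collections import Counter

def char_ngrams(word, n=3):
    padded = f"^{word}$"                   # mark word boundaries
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def build_model(corpus_words, n=3):
    """Count character n-grams over a (toy) corpus."""
    counts = Counter()
    for w in corpus_words:
        counts.update(char_ngrams(w, n))
    return counts

def is_suspicious(word, model, n=3):
    """Flag a word as a likely OCR error if it contains any
    character n-gram never seen in the corpus."""
    return any(model[g] == 0 for g in char_ngrams(word, n))

corpus = ["recognition", "recognise", "cognitive", "reconnect"]
model = build_model(corpus)
ok = is_suspicious("recognition", model)    # all trigrams attested
bad = is_suspicious("recogn1tion", model)   # trigrams like 'n1t' are unseen
```

Because the model is built over sub-word units rather than whole words, it generalizes to inflected forms that never appear verbatim in the corpus.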

 

Year of completion: January 2014
Advisor : Prof. C. V. Jawahar

 


Related Publications

  • Praveen Krishnan, Naveen Sankaran, Ajeet Kumar Singh and C. V. Jawahar - Towards a Robust OCR System for Indic Scripts, Proceedings of the 11th IAPR International Workshop on Document Analysis Systems, 7-10 April 2014, Tours-Loire Valley, France. [PDF]

  • Naveen Sankaran, Aman Neelappa and C V Jawahar - Devanagari Text Recognition: A Transcription Based Formulation, Proceedings of the 12th International Conference on Document Analysis and Recognition, 25-28 Aug. 2013, Washington DC, USA. [PDF]

  • Naveen Sankaran and C V Jawahar - Error Detection in Highly Inflectional Languages, Proceedings of the 12th International Conference on Document Analysis and Recognition, 25-28 Aug. 2013, Washington DC, USA. [PDF]

  • Naveen Sankaran, C V Jawahar - Recognition of Printed Devanagari Text Using BLSTM Neural Network, Proceedings of the 21st International Conference on Pattern Recognition, 11-15 Nov. 2012, pp. 322-325, Vol. 21, ISBN 978-4-9906441-1-6, Japan. [PDF]

 


Downloads

 

thesis

 ppt

Modeling Scene Text and Texture by Decomposing into Component Images


Siddharth Kherada (homepage)

Separating images into constituent components, based on the source of the data, its frequency distribution, or its nature, is a widely used technique in image processing and computer vision. Many problems are solved by partitioning images into components and working on each component separately. Common examples include breaking an image down into Red, Green and Blue channels for ease of representation, or into Luminance and Chrominance for better compression. In this thesis, we explore the separation of natural images into appropriate components for the purposes of representation as well as recognition. We first introduce a framework where separating images into direct and global components helps in modeling 3D textures. These 3D textures are often described by a parametric function at each pixel that models the variation in its appearance with respect to the lighting direction. However, parametric models such as Polynomial Texture Maps (PTMs) tend to smooth out changes in appearance. We therefore propose a technique to effectively model natural material surfaces and their interactions with changing light conditions. We show that the direct and global components of an image have different characteristics and, when modeled separately, lead to a more accurate and compact model of the 3D surface texture. The direct component is mainly affected by the structural properties of the surface and therefore captures phenomena like shadows and specularity, which are sharply varying functions. The global component models the overall luminance and color values, a smoothly varying function. For a given lighting position, both components are computed separately and combined to render a new image. This method models sharp shadows and specularities while preserving the structural relief and surface color, so the rendered images have enhanced photorealism compared to images rendered by existing single-pixel models such as PTMs.
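For the smoothly varying global component, a standard PTM fit is a per-pixel least-squares problem over a six-term biquadratic basis in the projected light direction (lu, lv). A sketch for a single pixel with synthetic observations might look as follows; the direct component (shadows, specularity) would be modeled separately, as the paragraph above describes:

```python
import numpy as np

def fit_ptm(lights, intensities):
    """Per-pixel least-squares fit of the six-term biquadratic PTM basis:
    I(lu, lv) ~ a0*lu^2 + a1*lv^2 + a2*lu*lv + a3*lu + a4*lv + a5."""
    lu, lv = lights[:, 0], lights[:, 1]
    A = np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones_like(lu)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return coeffs

def eval_ptm(coeffs, lu, lv):
    """Evaluate the fitted pixel model for a new light direction."""
    return np.array([lu**2, lv**2, lu * lv, lu, lv, 1.0]) @ coeffs

# toy pixel whose smooth (global) response is exactly biquadratic
rng = np.random.default_rng(0)
lights = rng.uniform(-1, 1, size=(20, 2))        # sampled light directions
true = np.array([0.2, -0.1, 0.05, 0.3, 0.4, 0.5])
lu, lv = lights[:, 0], lights[:, 1]
obs = np.stack([lu**2, lv**2, lu * lv, lu, lv, np.ones(20)], axis=1) @ true
coeffs = fit_ptm(lights, obs)
global_part = eval_ptm(coeffs, 0.1, 0.2)         # global component, new light
```

Rendering then adds the separately modeled direct term for the same light position, which is where the sharp shadows and specularities that a pure PTM smooths away are reintroduced.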

We then look at separating an image based on its sources of illumination or albedo variations for the purpose of scene text segmentation. Extracting text from scene images is a challenging task due to variations in the color, size and font of the text, and the results are often affected by complex backgrounds, different lighting conditions, shadows and reflections. A robust solution to this problem can significantly enhance the accuracy of scene text recognition algorithms, leading to a variety of applications such as scene understanding, automatic localization and navigation, and image retrieval. We propose a method to extract and binarize text from images that contain complex backgrounds. We use Independent Component Analysis (ICA) to map out the text region, which is inherently uniform in nature, while relegating shadows, specularity and reflections to the background. The technique identifies the text regions from the components extracted by ICA using a simple global thresholding method to isolate the foreground text. We show the results of our algorithm on some of the most complex word images from the ICDAR 2003 Robust Word Recognition dataset and compare with previously reported methods.
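The final thresholding stage can be any global method; a common choice is Otsu's threshold, sketched below on a synthetic bimodal signal. The ICA decomposition itself is omitted here, and the toy data merely stands in for a text-bearing component:

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Otsu's global threshold: choose the cut that maximizes the
    between-class variance of the two resulting groups."""
    hist, edges = np.histogram(values, bins=nbins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                      # class-0 (below cut) weight
    w1 = 1 - w0
    cum_mean = np.cumsum(p * centers)
    mu_total = cum_mean[-1]
    mu0 = cum_mean / np.where(w0 > 0, w0, 1)
    mu1 = (mu_total - cum_mean) / np.where(w1 > 0, w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between)]

# bimodal toy "component": background around 0.2, text around 0.8
rng = np.random.default_rng(1)
comp = np.concatenate([rng.normal(0.2, 0.03, 500), rng.normal(0.8, 0.03, 100)])
t = otsu_threshold(comp)
binary = comp > t        # foreground text mask
```

On a component where ICA has already concentrated the text into one uniform mode, this single global cut is enough to isolate the foreground.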

 

Year of completion: December 2014
Advisor : Anoop M. Namboodiri

 


Related Publications

  • Siddharth Kherada and Anoop M Namboodiri - An ICA based Approach for Complex Color Scene Text Binarization, Proceedings of the 2nd Asian Conference on Pattern Recognition, 05-08 Nov. 2013, Okinawa, Japan. [PDF]

  • Siddharth Kherada, Prateek Pandey, Anoop M. Namboodiri - Improving Realism of 3D Texture using Component Based Modeling, Proceedings of the IEEE Workshop on Applications of Computer Vision, 9-11 Jan. 2012, ISSN 1550-5790, E-ISBN 978-1-4673-0232-6, Print ISBN 978-1-4673-0233-3, pp. 41-47, Breckenridge, CO, USA. [PDF]


Downloads

thesis

ppt

 

Skyline Segmentation Using Shape-constrained MRFs


Rashmi Tone Vilas (homepage)

MRF energy minimization has been used for image segmentation in a wide range of applications, but standard MRF energy minimization techniques are computationally expensive. Moreover, incorporating higher-order priors, such as shape and its associated parameters, is either very complex or computationally expensive, or requires prior information such as the shape's location. Furthermore, a pure MRF formulation does not provide semantic understanding: information about the structure of a skyline, such as depth, cannot be recovered from its output. Standard semantic segmentation methods using geometric context information are restricted to very few geometric classes, while those that exploit a specific “tiered” structure are computationally exponential in the number of labels.

Our aim is to extract the detailed structure of a skyline, i.e. the individual buildings and their depth ordering, with no restriction on the number of labels. The problem is challenging for numerous reasons, such as complex occlusion patterns, the large number of labels, and intra-region color and texture variations. We propose an approach for segmenting the individual buildings in typical skylines, based on a Markov Random Field (MRF) formulation that exploits the fact that such images contain overlapping objects of similar shapes exhibiting a “tiered” structure. Our contributions are the following:

  • We introduce Skyline-12, a dataset of 120 skyline images from 12 cities around the world. All images are manually annotated, with additional meta-data such as initial boundaries and seeds.
  • We analyse and integrate low-level features such as color, texture and shape that are useful for the segmentation of skylines.
  • We propose a fast, accurate and robust method to extract the individual buildings of a skyline, exploiting the “tiered” structure of skylines and incorporating a rectangular shape prior in the MRF formulation.

For simple shapes such as rectangles, our formulation is significantly faster to optimize than a standard MRF approach, while also being more accurate. We experimentally evaluate various MRF formulations and demonstrate the effectiveness of our approach in segmenting skyline images.
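The benefit of the tiered assumption can be seen in a stripped-down one-dimensional case: if each image column carries exactly one boundary row, the optimal boundary under a unary-plus-smoothness energy can be found exactly by dynamic programming over columns. The sketch below illustrates only this idea, with an invented toy cost table, and is not the thesis's full multi-label MRF formulation:

```python
import numpy as np

def tiered_boundary(unary, smooth_weight=1.0):
    """Viterbi-style DP: for each column choose one boundary row,
    trading per-column evidence (unary cost) against a smoothness
    penalty |row_i - row_{i-1}| between neighbouring columns."""
    n_cols, n_rows = unary.shape
    rows = np.arange(n_rows)
    cost = unary[0].copy()
    back = np.zeros((n_cols, n_rows), dtype=int)
    for c in range(1, n_cols):
        pair = smooth_weight * np.abs(rows[:, None] - rows[None, :])  # [prev, cur]
        total = cost[:, None] + pair
        back[c] = total.argmin(axis=0)
        cost = total.min(axis=0) + unary[c]
    path = [int(cost.argmin())]           # backtrack the best boundary
    for c in range(n_cols - 1, 0, -1):
        path.append(int(back[c, path[-1]]))
    return path[::-1]

# toy 5-column, 4-row costs: strong evidence for row 1, one noisy column
unary = np.full((5, 4), 5.0)
unary[:, 1] = 0.0
unary[2] = [0.0, 5.0, 5.0, 5.0]           # column 2 noisily prefers row 0
boundary = tiered_boundary(unary, smooth_weight=3.0)
```

With a sufficiently strong smoothness weight, the DP overrides the single noisy column and keeps the boundary at row 1 throughout, which is the kind of regularization the tiered prior provides far more cheaply than generic multi-label MRF inference.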

We propose both interactive and automatic methods for segmenting skylines. While the interactive setting gives accurate output and a fast approach to segmenting skylines given input seeds from the user, the automatic setting provides about 25% improvement over state-of-the-art low-level automatic segmentation methods. Our approach can be generalized to other shapes, and the detailed structure of a skyline can be used in many applications, such as 3D reconstruction of a skyline from a single image.

 

Year of completion: January 2015
Advisor : Prof. C. V. Jawahar



Downloads

thesis

ppt

Enhancing Bag of Words Image Representations


Vinay Garg (homepage)

The bag of visual words model, inspired by text classification, has been used extensively for solving various computer vision tasks such as image classification, image retrieval and image recognition. In text classification, the vocabulary is a fixed, finite set of words from a particular language; this is not the case in the visual domain, where there is no inherent concept of words. The complexity of the visual domain is so high that even a small change such as rotation, translation, a change of viewing angle, or lighting has a huge impact on the information a machine perceives. Even though such changes make little difference to humans, to a machine the results are all different images. To overcome this problem, the vision community has defined the concept of visual words, analogous to textual words. However, visual words are not very well defined, owing to the vastness of the visual domain compared to textual data with its finite number of words. Using these visual words, we create an image representation as the frequency of each visual word in the image, and in turn use these representations for various vision tasks.

In this thesis we aim to improve these image representations, since the accuracy and performance of various vision models depend directly on the quality of the image representations given to them as input. We start with the traditional bag of visual words, study various practical issues and drawbacks of that approach, and refine one step of the pipeline at a time. In doing so, we devise novel strategies to overcome some of the issues observed in the traditional approaches. The approaches we apply involve various parameters that need fine tuning; we discuss the effect of each parameter in detail, with empirical results to support our hypotheses, and conclude that our representations are better than the various traditional approaches presently in use.

To address the information loss caused by hard assignments in the traditional bag of words, we analyze various soft assignment techniques. On replacing hard assignments with soft assignments, we find that classification results improve drastically. Comparing different soft assignment techniques among themselves, we find that absolute soft assignments are better than relative soft assignments. We demonstrate the superiority of our approaches on various popular datasets.
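A soft-assignment histogram of the kind discussed can be sketched as follows: each descriptor spreads Gaussian-kernel weight over the whole codebook instead of casting one vote for its nearest visual word. The codebook, descriptors and kernel width below are toy values for illustration, not the thesis's particular soft-assignment scheme:

```python
import numpy as np

def soft_bow(descriptors, codebook, sigma=1.0):
    """Soft-assignment bag of words: each descriptor distributes
    Gaussian-kernel weights over all visual words instead of
    voting only for its nearest word."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))
    w /= w.sum(axis=1, keepdims=True)   # each descriptor contributes weight 1
    hist = w.sum(axis=0)
    return hist / hist.sum()            # normalised image representation

# toy 2-D codebook of three visual words, and three descriptors
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descs = np.array([[0.1, 0.0], [0.9, 0.1], [0.5, 0.5]])  # last one ambiguous
hist = soft_bow(descs, codebook, sigma=0.3)
```

The ambiguous third descriptor, which a hard assignment would force onto a single word, here splits its weight across several words, which is exactly the information that hard quantization discards.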

Recently, the vision community has shown that Fisher vector image representations outperform bag of words representations. This boost in performance arises because Fisher vectors use soft assignments and reduce information loss by capturing the deviation of each visual feature from the mean. However, they too have their share of drawbacks: Fisher vector representations are very large, and they are not inherently discriminative. We therefore introduce sparseness to reduce the effective size of the representations, and add class information to make them discriminative. These additions reduce the high storage requirements, while the added class information also improves performance. We validate these findings on various datasets.
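To make the size versus sparsity trade-off concrete, the sketch below computes a first-order Fisher vector with respect to GMM means and then sparsifies it by keeping only the largest-magnitude entries. This is a generic illustration under simplifying assumptions (scalar sigma, means-only gradients, toy data), not the thesis's exact sparse discriminative formulation:

```python
import numpy as np

def fisher_vector_means(descs, means, weights, sigma):
    """First-order Fisher vector w.r.t. the means of a diagonal GMM:
    soft-assign each descriptor, then accumulate its normalised
    deviation from every Gaussian's mean."""
    d2 = (((descs[:, None, :] - means[None]) / sigma) ** 2).sum(-1)
    log_p = np.log(weights) - 0.5 * d2
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)       # responsibilities
    dev = (descs[:, None, :] - means[None]) / sigma
    fv = (gamma[:, :, None] * dev).sum(axis=0)
    fv /= descs.shape[0] * np.sqrt(weights)[:, None]
    return fv.ravel()

def sparsify(fv, keep=0.5):
    """Keep only the largest-magnitude fraction `keep` of entries."""
    k = max(1, int(len(fv) * keep))
    thresh = np.sort(np.abs(fv))[-k]
    return np.where(np.abs(fv) >= thresh, fv, 0.0)

rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [3.0, 3.0]])
weights = np.array([0.5, 0.5])
descs = rng.normal(0.5, 1.0, size=(50, 2))   # toy descriptors, near Gaussian 0
fv = fisher_vector_means(descs, means, weights, sigma=1.0)
sparse_fv = sparsify(fv, keep=0.5)
```

Even this tiny example shows why sparsification pays off: the gradients with respect to Gaussians that a descriptor set barely touches are near zero and can be dropped with little loss.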

Driven by the hypothesis that improving the individual steps of an image representation pipeline will improve the final representation, we try various techniques to refine these steps and, in turn, the performance of our model. After improving the final step of creating image representations from visual words, we turn to improving the set of visual words (the vocabulary) itself. We find that many of the visual words used to build image representations are not actually useful, and that the visual vocabulary contains a lot of redundant words. To further improve our representations, we devise a novel technique that draws visual words from different types of vocabularies, each capturing a different type of information from a given set of images, and combines the best of them into a final global vocabulary used to compute the final image representations. Again, we use benchmark datasets to demonstrate that our hypothesis is correct.

In this thesis, we have attempted to solve the classification task in a better way. Although we are not able to solve these research tasks perfectly (i.e., to reach a perfect score), we hope that our findings will at least provide a starting point for new directions that lead towards the ultimate goal of replicating human vision.

     

Year of completion: May 2015
Advisor : Prof. C. V. Jawahar

                           


Related Publications

  • Vinay Garg, Siddhartha Chandra, C V Jawahar - Sparse Discriminative Fisher Vectors in Visual Classification, Proceedings of the 8th Indian Conference on Vision, Graphics and Image Processing, 16-19 Dec. 2012, Bombay, India. [PDF]

  • Vinay Garg, Sreekanth Vempati and C. V. Jawahar - Bag of visual words: A soft clustering based exposition, Proceedings of the 3rd National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, ISBN 978-0-7695-4599-8, pp. 37-40, 15-17 Dec. 2011, Hubli, India. [PDF]

     


Downloads

thesis

ppt
