A Study of X-Ray Image Perception for Pneumoconiosis Detection


Varun Jampani

Pneumoconiosis is an occupational lung disease caused by the inhalation of industrial dust. Despite increasing safety measures and better workplace environments, pneumoconiosis is deemed to be the most common occupational disease in developing countries like India and China. Screening and assessment of this disease is done through radiological observation of chest x-rays. Several studies have shown significant inter- and intra-reader variation in the diagnosis of this disease, showing the complexity of the task and the importance of expertise in diagnosis.

The present study is aimed at understanding the perceptual and cognitive factors affecting the reading of chest x-rays of pneumoconiosis patients. Understanding these factors helps in developing better image acquisition systems, better training regimens for radiologists and better computer aided diagnostic (CAD) systems. We used an eye tracking experiment to study the various factors affecting the assessment of this diffuse lung disease. Specifically, we aimed at understanding the roles of expertise and of the contralateral symmetric (CS) information present in chest x-rays in the diagnosis and in the eye movements of the observers. We also studied the inter- and intra-observer fixation consistency, along with the role of anatomical and bottom-up saliency features in attracting the gaze of observers of different expertise levels, to get better insights into the effect of bottom-up and top-down visual saliency on the eye movements of observers.

The experiment was conducted in a room dedicated to eye tracking experiments. Participants, consisting of novices (3), medical students (12), residents (4) and staff radiologists (4), were presented with good quality PA chest x-rays and were asked to give profusion ratings for each of the 6 lung zones. An image set consisting of 17 normal full chest x-rays and 16 single-lung images was shown to the participants in random order. The time taken for diagnosis and the eye movements were also recorded using a remote, head-free eye tracker.

[Figures: chest x-ray heatmap; original chest x-ray]

Results indicated that expertise and CS information play important roles in the diagnosis of pneumoconiosis. Novices and medical students are slow and inefficient, whereas residents and staff are quick and efficient. A key finding of our study is that the presence of CS information alone does not help improve diagnosis as much as learning how to use the information. This learning appears to be gained from focused training and years of experience. Hence, good training for radiologists and careful observation of each lung zone may improve the quality of diagnostic results. For residents, the eye scanning strategies play an important role in using the CS information present in chest radiographs; however, in staff radiologists, peripheral vision or higher-level cognitive processes seem to play a role in using the CS information.

There is reasonably good inter- and intra-observer fixation consistency, suggesting the use of similar viewing strategies. Experience helps the observers to develop new visual strategies based on the image content so that they can quickly and efficiently assess the disease level. The first few fixations seem to play an important role in choosing the visual strategy appropriate for the given image.

Both inter-rib and rib regions are given equal importance by the observers. Despite the reading of chest x-rays being highly task dependent, bottom-up saliency is shown to have played an important role in attracting the fixations of the observers. This role of bottom-up saliency seems to be greater in the lower expertise groups than in the higher expertise groups. Both bottom-up and top-down influences on visual fixations seem to change with time. The relative role of top-down and bottom-up influences on visual attention is still not completely understood and remains a part of future work.

Based on our experimental results, we have developed an extended saliency model by combining the bottom-up saliency and the saliency of lung regions in a chest x-ray. This new saliency model performed significantly better than bottom-up saliency alone in predicting the gaze of the observers in our experiment. Even though the model is a simple combination of bottom-up saliency maps and segmented lung masks, this demonstrates that even basic models using simple image features can predict the fixations of the observers with good accuracy.
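The abstract does not say how the saliency map and the lung mask are combined, so the following is only a minimal sketch under stated assumptions: a weighted sum with a hypothetical `weight` parameter, and a crude score (mean saliency at fixation points) standing in for whatever evaluation the thesis actually used. The function names are illustrative and not taken from the thesis.

```python
import numpy as np

def extended_saliency(bottom_up, lung_mask, weight=0.5):
    """Blend a bottom-up saliency map with a binary lung mask (both H x W).

    Assumption: a weighted sum; the thesis only states that the two are combined.
    Returns a map rescaled so that its maximum is 1 (or all zeros).
    """
    s = bottom_up.astype(float)
    span = s.max() - s.min()
    s = (s - s.min()) / span if span > 0 else np.zeros_like(s)
    combined = (1.0 - weight) * s + weight * lung_mask.astype(float)
    peak = combined.max()
    return combined / peak if peak > 0 else combined

def mean_saliency_at_fixations(saliency, fixations):
    """Average saliency at (row, col) fixation points: a crude gaze-prediction score."""
    rows, cols = zip(*fixations)
    return float(saliency[list(rows), list(cols)].mean())
```

With fixations that fall inside the lungs, the blended map scores higher than the bottom-up map whenever the bottom-up map alone is weak there, which is the intuition behind adding the lung mask.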

Experimental analysis suggested that the factors affecting the reading of chest x-rays of pneumoconiosis are complex and varied. A good understanding of these factors definitely helps in the development of better radiological screening of pneumoconiosis through improved training and also through the use of improved CAD tools. The presented work is an attempt to get insights into what these factors are and how they modify the behavior of the observers.

 

Year of completion: January 2013
Advisor: Jayanthi Sivaswamy


Robust Motion Estimation and Analysis based on Statistical Information.


V S Rao Veeravasarapu (homepage)


Accurate and robust estimation of optical flow continues to be of interest due to the deep penetration of digital cameras into many areas, including robot navigation and video surveillance applications. The canonical approach to flow estimation relies on local brightness constancy, which has limitations. In this thesis, we re-examine the optical flow problem, formulate an alternate hypothesis that optical flow is an apparent motion of local information across frames, and propose a novel framework to robustly estimate flow parameters. A pixel-level matching approach has been implemented according to the proposed formulation, in which optical flow is estimated based on the local information associated with each pixel. Self-information and a variety of divergence measures have been investigated for capturing the local information. Results of benchmarking with the Middlebury dataset show that the proposed formulation is comparable to the top performing methods in accurate flow computation. The distinguishing aspects, however, are that these results hold for small as well as large displacements and that the flow estimation is robust to distortions such as noise, illumination changes and non-uniform blur. Thus, the local information based approach offers a promising alternative for computing optical flow. We also developed a method to remove motion blur from frames by using the information measures. The effectiveness of the proposed motion estimation approach is also demonstrated on the extraction of structure from motion of synthetic micro-texture patterns, on cardiac ultrasound sequences and on the colorization of black and white videos.
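To make the idea of matching local information rather than brightness concrete, here is a rough sketch. It rests on assumptions that are not in the abstract: self-information is estimated from a global intensity histogram of each frame (intensities assumed to lie in [0, 1]), and patches of the resulting information maps are compared with a sum of squared differences instead of the divergence measures the thesis investigates. The names `self_information` and `match_pixel` are hypothetical.

```python
import numpy as np

def self_information(frame, bins=32):
    """Per-pixel self-information -log2 p(v), with p taken from the frame's own histogram.

    Assumes intensities in [0, 1].
    """
    idx = np.minimum((frame * bins).astype(int), bins - 1)
    counts = np.bincount(idx.ravel(), minlength=bins)
    p = counts / counts.sum()
    return -np.log2(p[idx])  # every occupied bin has p > 0

def match_pixel(info_a, info_b, y, x, patch=3, search=4):
    """Displacement (dy, dx) whose patch in info_b best matches the patch around (y, x) in info_a."""
    h, w = info_b.shape
    ref = info_a[y - patch:y + patch + 1, x - patch:x + patch + 1]
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            # Skip candidates whose patch would leave the image.
            if yy - patch < 0 or xx - patch < 0 or yy + patch >= h or xx + patch >= w:
                continue
            cand = info_b[yy - patch:yy + patch + 1, xx - patch:xx + patch + 1]
            cost = np.sum((ref - cand) ** 2)  # assumption: SSD, not a divergence measure
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best
```

The exhaustive search window is what lets a scheme of this kind handle larger displacements, at a cost that grows with the square of `search`.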

 

Year of completion: March 2013
Advisor: Jayanthi Sivaswamy


Registration of Retinal Images.


Yogesh Babu Bathina

Every so often, a need arises in clinical scenarios for integrating information from multiple images or modalities for the purposes of diagnosis and pathology tracking. Registration, the most fundamental step in such an integration, is the task of spatially aligning a pair of images of the same scene acquired from different sources, viewpoints and times. This thesis concerns the task of registration specific to the three most popular retinal imaging modalities, namely Color Fundus Imaging (CFI), Red-Free Imaging (RFI) and Fluorescein Fundus Angiography (FFA). CFI is obtained under white light, which enables experts to examine the overall condition of the retina in full color. In RFI, the illuminating light is filtered to remove red color, which improves the contrast between vessels and other structures. FFA is a set of time sequence images acquired under infrared light after a fluorescent dye is injected intravenously into the blood stream. This provides high contrast vessel information revealing blood flow dynamics, leaks and blockages.

The retina is a part of the central nervous system (CNS) and is composed of many different types of tissues. Given this distinctive feature, a wide variety of diseases affecting different body systems uniquely affect the retina. These systemic diseases include diabetes, hypertension, atherosclerosis, sickle cell disease and multiple sclerosis, to name a few. Recent advancements reveal a close association of retinal vascular signs with cerebrovascular, cardiovascular and metabolic outcomes. Simply put, the health of blood vessels in the eye often indicates the condition of the blood vessels (arteries and veins) throughout the body.

Registration of multimodal retinal images aids in the diagnosis of various kinds of retinal diseases such as glaucoma, diabetic retinopathy and age-related macular degeneration. Single modality images acquired over a period of time are used for pathology tracking. Registration is also used for constructing a mosaic image of the entire retina from several narrow field images, which aids comprehensive retinal examination. Another key application area for registration is surgery, both in the planning stage and during surgery, for which only optical range information is available. Fusion of these modalities also helps to increase the anatomical range of visual inspection, to detect potentially serious pathologies early and to assess the relationship between blood flow and the diseases occurring on the surface of the retina.

The task of registering retinal images is challenging given the wide range of pathologies captured via different modalities in different ways, geometric and photometric variation, illumination artifacts, noise and other degradations. Many successful methods have been proposed in the past for registering retinal images. A review of these methods shows good performance on healthy retinal images. However, the scope for handling a wide range of pathologies is limited for most of the approaches. Further, these methods fail to register poor quality images, especially in the multimodal case. In this work, we propose a feature based retinal image registration algorithm capable of handling such challenging image pairs.

At the core of this algorithm is a novel landmark detector and descriptor scheme. A set of landmarks is detected on the topographic surface of the retina using a curvature dispersion measure. The descriptor is based on local projections using the Radon transform, which characterizes local structures in an abstract sense, rendering it less sensitive to pathologies and noise. Drawing on recent developments in robust estimation methods, a modified MSAC (M-estimator Sample Consensus) is proposed for pruning false correspondences. On the whole, the minor contributions at each stage of the feature based registration scheme presented here are of significance. We evaluate our method against two recent schemes on three different datasets, which include both monomodal and multimodal images. The results show that our method gives better accuracy for poor quality and pathology affected images while performing on par with the existing methods on normal images.
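For readers unfamiliar with MSAC, the sketch below shows the standard algorithm, not the modified variant proposed in the thesis, whose details the abstract does not give. It assumes a 2D affine transform as the motion model and a hypothetical pixel threshold `thresh`; the thesis may use a different model for retinal images. The key difference from plain RANSAC is the scoring: each correspondence contributes its squared residual, capped at the threshold, rather than a simple inlier count.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src (N, 2) to dst (N, 2), as a (3, 2) matrix."""
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def msac(src, dst, thresh=2.0, iters=500, seed=0):
    """Standard MSAC: returns (affine matrix refit on inliers, boolean inlier mask)."""
    rng = np.random.default_rng(seed)
    A = np.hstack([src, np.ones((len(src), 1))])
    best_cost, best_inliers = np.inf, None
    for _ in range(iters):
        pick = rng.choice(len(src), size=3, replace=False)  # minimal sample for an affine model
        M = fit_affine(src[pick], dst[pick])
        r2 = np.sum((A @ M - dst) ** 2, axis=1)             # squared residual per correspondence
        cost = np.sum(np.minimum(r2, thresh ** 2))          # truncated quadratic loss
        if cost < best_cost:
            best_cost, best_inliers = cost, r2 < thresh ** 2
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

Correspondences outside the returned mask are the ones that would be pruned as false matches before the final transform is estimated.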

 

Year of completion: April 2013
Advisor: Jayanthi Sivaswamy


Detection and Segmentation of Stroke Lesions from Diffusion Weighted MRI Data of the Brain.


Shashank Mujumdar (homepage)

 

Stroke is a chronic disease which often leads to death. Different medical imaging modalities enable the diagnosis of stroke after the onset of symptoms. Time is of the essence during stroke analysis since the window of therapy is very small (< 3 hrs after the onset of symptoms). Recent clinical studies have shown the usefulness and significance of diagnosing stroke on Diffusion Weighted Magnetic Resonance Imaging (DWI) scans of the brain in the early stages. Visual inspection of the DWI scans is difficult since multiple scans with varied contrast are acquired for a patient, and the scans depict complementary information about the diffusion process in the brain. To make matters worse, the DWI scans are acquired at a very low resolution with a poor signal to noise ratio (SNR), since the acquisition time is very short (< 1 min), and are confounded by artifacts that mimic stroke lesions. Thus, an automated framework which can accurately capture the stroke lesions in the DWI data would assist clinicians in making a better diagnosis. This is the focus of the thesis.

[Figures: DWI segmentation; original DWI scan]

Varying the acquisition parameter (b-value) generates different DWI scans with varied contrast. DWI scans with higher b-values provide improved sensitivity and conspicuity of stroke lesions and reduced artifacts, at the cost of lower SNR. Along with the DWI scans, Apparent Diffusion Coefficient (ADC) maps are also derived, which give a measure of the true diffusion process in the brain irrespective of the acquisition artifacts that resemble stroke. In this thesis, we argue that integrating information from multiple sources, namely low and high b-value data along with the ADC maps, can aid better characterization of stroke lesions in the data. Accordingly, we propose a novel approach for detecting and segmenting stroke regions from DWI data.
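As background on how an ADC map relates to the low and high b-value scans, the sketch below uses the standard mono-exponential diffusion model, S_b = S_0 exp(-b · ADC), solved for ADC from two acquisitions. The abstract does not state which b-values or fitting procedure the thesis used, so the function name, the `eps` guard and the example values are assumptions for illustration.

```python
import numpy as np

def adc_map(s_low, s_high, b_low, b_high, eps=1e-6):
    """ADC from two DWI scans under S_b = S_0 * exp(-b * ADC).

    s_low, s_high: signal arrays acquired at b_low and b_high (b in s/mm^2).
    Returns ADC in mm^2/s. Signals are clamped to eps to avoid log(0).
    """
    s_low = np.maximum(np.asarray(s_low, dtype=float), eps)
    s_high = np.maximum(np.asarray(s_high, dtype=float), eps)
    return np.log(s_low / s_high) / (b_high - b_low)
```

Because the ADC is a ratio of two signals, it is independent of the baseline signal S_0, so a region that is bright on DWI only because S_0 is high will not show reduced ADC. This is consistent with the abstract's point that the ADC map helps separate lesions from artifacts that resemble stroke.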


 

Year of completion: July 2013
Advisor: Jayanthi Sivaswamy


Techniques for Organization and Visualization of Community Photo Collections


Kumar Srijan (homepage)

Due to the digital and information revolution we are presently witnessing, there is a huge and continuously increasing number of images on the Internet. For example, a query for "Eiffel Tower" on Google Images returns more than two million images. The easy accessibility of this data provides us with unique opportunities to mine the contents of these images, not only for automatic organization but also for providing interactive interfaces to browse, explore and query. This task is challenging given the massive size and the continuous growth of the collection. In addition, these collections are taken in varying imaging conditions, with different cameras, at different resolutions and from different perspectives, and have different degrees of occlusion present in them. Hence, for image collections, even the simplest of tasks, such as finding matching images, turns out to be hard.

The Computer Vision community has been actively designing and redesigning algorithms to overcome these challenges. One of the most widespread and noticeable ideas employed is that of extracting robust, invariant and repeatable local features in the images, followed by the quantization of the feature space into visual words. The similarity of images is gauged by the correspondence and similarity of their local features. Verification of the matches is done to eliminate spurious matches. Building a data structure such as an inverted index over these visual words can speed up the discovery of matching features. This mining of similar images by matching features forms the basis of all high level algorithms, such as clustering, skeletonization and summarization, which help in the organization, exploration and querying of these image collections. This thesis presents two novel algorithms which help in achieving this goal.

First, we introduce a novel indexing scheme that makes it possible to do exhaustive pairwise matching in large image collections. The quantization of image features and their indexing provide only a limited amount of leverage for speeding up the image matching process, which depends upon the sparsity of the posting lists. This sparsity is controlled by the number of visual words used, which after a point cannot be increased arbitrarily without affecting recall. Our scheme generates higher order features by pairing up nearby features and encoding their affine geometry. This provides a much larger feature space to index, which can subsequently be reprojected to any desired size by defining appropriate hash functions. We implement our indexing scheme by providing an analogy with Bloom filters. The higher order features extracted in the images are inserted into their respective equally sized Bloom filters using a single hash function. This uniformity in the Bloom filters allows a single inverted index to index the hash buckets of all the Bloom filters, thus providing a simplified interface to implicitly query all the Bloom filters. We choose the size of these Bloom filters to be in proportion to the size of the database. This enables us to do querying in constant time, since the average size of the posting lists becomes constant. Also, the use of such large implicit Bloom filters sufficiently mitigates the negative effects of using a single hash function. As a result, we are able to do exhaustive pairwise matching over large databases of up to 100K images in linear time complexity.

Second, we present a fast and easy-to-implement framework for browsing large image collections of landmarks and monumental sites. The existing framework, "Phototourism", requires a reconstruction of the whole scene using a Structure from Motion package called Bundler. This requires pairwise matching to generate tracks of matching features across images. Next, an incremental approach is applied, starting with a seed reconstruction and adding more matching images into the reconstruction. This, however, requires continuous refinement of the whole reconstruction using a computationally expensive procedure called bundle adjustment. The pairwise matching and bundle adjustment become the limiting factors in scaling this technique to large image collections.

To overcome the issues faced with "Phototourism", our framework employs independent partial reconstructions of the scene. We use a standard bag-of-words model and indexing techniques to determine the closest neighbours of each image in the collection, and do a local reconstruction corresponding to each image using only the neighbouring images. This requires us to solve only multiple simple reconstruction problems instead of one large reconstruction problem, making it computationally more tractable. Our browsing interface hops from one reconstruction to another to give the user an illusion of browsing a global reconstruction. Our approach also makes it easy to adapt to growing image collections, as adding an image only incurs the cost of creating a new independent reconstruction. We validate our approach with a Golkonda Fort image dataset consisting of 6K images.

In summary, the techniques presented in this thesis for organizing large image collections try to solve the problem of doing exhaustive pairwise matching in image collections in a scalable manner, for which a novel indexing scheme is proposed. We also present a novel technique for overcoming the problems faced while doing "Structure from Motion" on large image collections. We hope that these techniques will find application in browsing and mining matching images in large image collections, and also in creating virtual experiences of several monuments and sites across the globe.
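The role of the inverted index can be illustrated with a small sketch. It assumes each image has already been reduced to a set of visual word ids; the feature extraction, quantization and geometric verification steps are outside its scope, and the function names are illustrative rather than taken from the thesis.

```python
from collections import Counter, defaultdict

def build_inverted_index(image_words):
    """image_words: {image_id: iterable of visual word ids} -> {word id: set of image ids}."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for word in set(words):
            index[word].add(image_id)
    return index

def candidate_matches(index, query_words, top_k=5):
    """Rank database images by the number of visual words shared with the query."""
    votes = Counter()
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    return votes.most_common(top_k)
```

Only the posting lists of the words present in the query are visited, so images sharing no visual word with the query are never touched. The top candidates would then go through verification to discard spurious matches.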
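A heavily simplified sketch of this indexing idea follows. It departs from the thesis in ways that should be read as assumptions: the higher order feature is just the pair of visual word ids of two nearby features (the affine geometry encoding is omitted), the hash is a generic one from the standard library, and `radius` and `buckets_per_image` are hypothetical parameters. What it does preserve is the structure described above: one hash function, a bucket count proportional to the collection size, and a single inverted index over buckets.

```python
import hashlib
from collections import Counter, defaultdict
from itertools import combinations

def pair_keys(features, radius):
    """features: list of (word_id, x, y). Yield a key for each pair of features within radius."""
    for (w1, x1, y1), (w2, x2, y2) in combinations(features, 2):
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= radius ** 2:
            yield tuple(sorted((w1, w2)))  # assumption: word pair only, no affine geometry

def bucket(key, num_buckets):
    """Map a higher order feature to one of num_buckets buckets with a single hash function."""
    digest = hashlib.blake2b(repr(key).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets

def pairwise_match_counts(images, radius=5.0, buckets_per_image=1024):
    """images: {image_id: features}. Count shared buckets for every pair of images."""
    num_buckets = buckets_per_image * len(images)  # index size grows with the collection
    index = defaultdict(set)                       # one inverted index: bucket -> image ids
    for image_id, features in images.items():
        for key in pair_keys(features, radius):
            index[bucket(key, num_buckets)].add(image_id)
    counts = Counter()
    for posting in index.values():
        for a, b in combinations(sorted(posting), 2):
            counts[(a, b)] += 1
    return counts
```

Because the number of buckets scales with the number of images, the expected posting list length stays roughly constant as the collection grows, which is the property the abstract relies on for its constant-time query claim.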
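The neighbour selection step could look roughly like the sketch below. It assumes each image is summarized by a bag-of-words histogram and that similarity is measured with cosine similarity on a dense matrix; the thesis uses indexing techniques that the abstract does not detail, and the reconstruction step itself is not shown. The function name and the parameter `k` are illustrative.

```python
import numpy as np

def nearest_neighbours(histograms, k):
    """histograms: (N, V) bag-of-words counts, one row per image.

    Returns an (N, k) array holding, for each image, the indices of the k most
    similar other images by cosine similarity.
    """
    h = np.asarray(histograms, dtype=float)
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    h = h / np.maximum(norms, 1e-12)   # unit-length rows; guards against empty histograms
    sim = h @ h.T
    np.fill_diagonal(sim, -np.inf)     # an image is not its own neighbour
    return np.argsort(-sim, axis=1)[:, :k]
```

Each row of the result defines the image set for one independent local reconstruction, so adding a new image only requires computing one more row and one more small reconstruction.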

 

Year of completion: July 2013
Advisor: C. V. Jawahar


More Articles …

  1. Patient-Motion Analysis in Perfusion Weighted MRI.
  2. Bag of Words and Bag of Parts models for Scene Classification in Images.
  3. Analysis of Stroke on Brain Computed Tomography Scans
  4. Multi-modal Semantic Indexing for Image Retrieval