
Graph-Spectral Techniques For Analyzing Resting State Functional Neuroimaging Data


Srinivas Govinda Surampudi

Abstract

The human brain is undoubtedly the most magnificent yet delicate arrangement of tissues, serving as the seat of a wide span of cognitive functions and behaviors. The neuronal activities within every neuron, collectively observed over networks of interconnected neurons, manifest themselves as patterns at multiple scales of observation. Brain imaging techniques such as fMRI, EEG and MEG measure these patterns as electromagnetic responses. These patterns supposedly play the role of unique neuronal signatures of the vast repertoire of cognitive functions. Experimentally, it is observed that different neuronal populations participate coherently to generate a signature for a cognitive function. These signatures can be investigated at the micro-scale, corresponding to responses of individual neurons to external current stimuli; at the meso-scale, related to populations of neurons that show similar metabolic activities; and at the macro-scale, where these populations, also known as regions of interest (ROIs), communicate via a complex arrangement of anatomical fiber pathways. The holy grail of neuroscience is thus to computationally decipher the interplay between this complex anatomical network and the complex functional patterns corresponding to cognitive behaviors at various scales. Each scale of observation, depending on the instruments of measurement, has its own rich spatiotemporal dynamics that interacts with higher and lower levels in complex ways. Large-scale anatomical fiber pathways are represented in a matrix that accounts for inter-population fiber strength, known as the structural connectivity (SC) matrix. One popular modality for capturing large-scale functional dynamics is resting-state fMRI, and the statistical dependence between inter-population BOLD signals is captured in the functional connectivity (FC) matrix. Many models provide computational accounts of the relationship between these two matrices, since deciphering this relationship would reveal the mechanism by which cognitive functions arise over the structure. On one hand, there are non-linear dynamical models that describe the biological phenomena well but are expensive and intractable; on the other, there are linear models that compromise on biological richness but are analytically feasible.

This thesis is concerned with the analysis of the temporal dynamics of observed resting-state fMRI signals over the large-scale human cortex. We provide a model that has a bio-physical explanation as well as an analytical expression for FC given SC. Reaction-diffusion systems provide a computational framework in which excitatory-inhibitory activities emerge at the populations as reactions, and their interactions unfold as diffusion over space and time. The spatio-temporal dynamics of the BOLD signal governed by this framework is constrained with respect to the anatomical connections, thereby separating the spatial and temporal dynamics. The covariance matrix of this signal is estimated, yielding an estimate of the functional connectivity matrix. The covariance matrix, and the BOLD signal in general, is expressed in terms of graph diffusion kernels, forming an analytically elegant expression. Most importantly, the model for FC abstracts out biological details and works in the realm of spectral graph-theoretic constructs, providing the necessary ease for computational analysis.
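As a rough sketch of the graph-diffusion-kernel construct mentioned above (an illustration under assumptions, not the thesis's code: the symmetric Laplacian normalization and all variable names are my choices), a heat kernel over the SC graph can be computed from its normalized Laplacian:

    import numpy as np
    from scipy.linalg import expm

    def heat_kernel(SC, beta):
        """Graph heat kernel exp(-beta * L) for a structural connectivity matrix."""
        deg = SC.sum(axis=1)
        d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
        # Symmetric normalized graph Laplacian of the SC graph
        L = np.eye(SC.shape[0]) - d_inv_sqrt @ SC @ d_inv_sqrt
        return expm(-beta * L)  # dense matrix exponential; fine for ~100 ROIs

Larger beta diffuses activity farther along the anatomical graph, so a bank of kernels at several scales captures short- and long-range structural influence.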
As this model learns both the combination parameters of multiple diffusion kernels and the kernels themselves, it is called the Multiple Kernel Learning (MKL) model. Apart from superior quantitative performance, the model parameters may act as biomarkers for various cognitive studies. Although the model parameters are learned for a cohort, the model preserves subject-specificity. These parameters can be used as a measure of inter-group differences and for dissimilarity identification, as demonstrated in this thesis with age-group identification as an example. Essentially, the MKL model partitions FC into two constituents: the influence of the underlying anatomical structure, captured in the diffusion kernels, and the cognitive theme of the temporal structure, captured in the model parameters, thus predicting FCs specific to subjects within the cognitive conditions of the cohort. Even though MKL is a cohort-based model, it maintains sensitivity towards anatomy: performance drops drastically with alterations in SC and model parameters, yet the model does not overfit to the cohort. Resting-state fMRI BOLD signals have been observed to show non-stationary dynamics. Such multiple spatio-temporal patterns, represented as dynamic FC matrices, are observed to repeat cyclically in time, motivating the use of a generic clustering scheme to identify latent states of the dynamics. We propose a novel solution that learns parameters specific to the dynamic states using a graph-theoretic model (temporal Multiple Kernel Learning, tMKL) and finally predicts the grand average FC of unseen subjects by leveraging a state-transition Markov model. We discover the underlying lower-dimensional manifold of the temporal structure, which is further parameterized as a set of local density distributions, or latent transient states. tMKL thus learns a mapping between the anatomical graph and the temporal structure. Unlike MKL, the tMKL model obeys a state-specific optimization formulation and yet performs at par with or better than MKL in predicting the grand average FC. Like MKL, tMKL also shows sensitivity towards subject-specific anatomy. Finally, both tMKL and MKL outperform the state-of-the-art in their own ways while providing bio-physical insights.
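A minimal sketch of the multiple-kernel idea (the actual MKL/tMKL objectives in the thesis are richer, e.g. tMKL's state-specific formulation; the plain least-squares fit and the diffusion scales below are simplifying assumptions). Cohort-level weights are fit on training data and reused with an unseen subject's own SC, which is what preserves subject-specificity:

    # Assumes heat_kernel() from the previous sketch.
    import numpy as np

    SCALES = [1.0, 2.0, 4.0, 8.0]  # illustrative diffusion scales

    def fit_weights(SC_train, FC_train):
        """Least-squares weights w such that sum_i w_i * K_i(SC) ~ FC."""
        kernels = [heat_kernel(SC_train, b) for b in SCALES]
        A = np.stack([K.ravel() for K in kernels], axis=1)
        w, *_ = np.linalg.lstsq(A, FC_train.ravel(), rcond=None)
        return w

    def predict_FC(SC_test, w):
        """Predict an unseen subject's FC from cohort weights and their own SC."""
        return sum(wi * heat_kernel(SC_test, b) for wi, b in zip(w, SCALES))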


Year of completion:  Sep 2018
 Advisors : Avinash Sharma and Dipanjan Roy

Related Publications

  • Viral Parekh, Ramanathan Subramanian, Dipanjan Roy and C.V. Jawahar - An EEG-based Image Annotation System - National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2017 [PDF]


Downloads

thesis

Exploration of multiple imaging modalities for Glaucoma detection


Jahnavi Gamalapati S

Abstract

Glaucoma is an eye disease characterized by the weakening of nerve cells, often resulting in a permanent loss of vision. Glaucoma progression can occur without any physical indication to patients; hence, early diagnosis is recommended to prevent permanent damage to vision. Early glaucoma is often characterized by thinning of the retinal nerve fiber layer (RNFL), commonly called an RNFL defect (RNFLD). Computer-aided diagnosis (CAD) of eye diseases is popular and is based on automated analysis of fundus images. CAD solutions for diabetic retinopathy have reached more maturity than those for glaucoma, as the latter is more difficult to detect from fundus images: the nerve damage appears as a subtle change in the background around the optic disc. SD-OCT (Spectral Domain Optical Coherence Tomography), a recently introduced modality, captures 3D information of the retina and is hence more reliable than fundus imaging for detecting nerve damage. However, wide usage of OCT is limited by the cost per scan, acquisition time and ease of acquisition. This thesis focuses on integrating information from multiple modalities (OCT and fundus) to improve the detection of RNFLD from fundus images. We examine two key problems in the context of CAD development: i) spatial alignment, or registration, of two imaging modalities, namely 2D fundus images and 3D OCT volumes. This can pave the way for integrating information across the modalities. Multimodal registration is challenging because of the varied fields of view and noise levels across the modalities. We propose a computationally efficient registration algorithm capable of handling the complementary nature of the modalities; extensive qualitative and quantitative evaluations show the robustness of the proposed method. ii) Detection of RNFLD from fundus images with good accuracy. We propose a novel CAD solution which utilises information from the two modalities to learn a model and uses it to predict the presence of RNFLD from fundus images. The problem is posed as learning from two modalities (fundus and OCT images) and predicting from only one (fundus images), with the other (OCT) treated as missing data. Our solution consists of a deep neural network architecture which learns modality-independent representations. In the final part of the thesis, we explore the scope of a new imaging modality, angiography Optical Coherence Tomography (A-OCT), in diagnosing glaucoma. Two case studies are reported which help in understanding the progression of retinal nerve fiber layer thickness and capillary density in normal and glaucoma-affected patients. The experiments on this new modality show its potential as a reliable biomarker alongside existing modalities.
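The "learn from two modalities, predict from one" setup lends itself to a two-branch network whose embeddings are pulled together during training, with only the fundus branch used at test time. This is a hedged sketch of that general pattern, not the thesis's architecture: the layer sizes, the MSE alignment term and all names are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TwoBranchNet(nn.Module):
        """Fundus and OCT encoders share an embedding space; only the
        fundus branch is needed at test time (OCT treated as missing)."""
        def __init__(self, fundus_dim, oct_dim, embed_dim=128, n_classes=2):
            super().__init__()
            self.fundus_enc = nn.Sequential(nn.Linear(fundus_dim, embed_dim), nn.ReLU())
            self.oct_enc = nn.Sequential(nn.Linear(oct_dim, embed_dim), nn.ReLU())
            self.classifier = nn.Linear(embed_dim, n_classes)

        def forward(self, fundus, oct_feats=None):
            z_f = self.fundus_enc(fundus)
            logits = self.classifier(z_f)
            if oct_feats is None:                      # inference: fundus only
                return logits, None
            z_o = self.oct_enc(oct_feats)              # training: align embeddings
            return logits, F.mse_loss(z_f, z_o)

    # Training objective (sketch): cross-entropy + lambda * alignment loss.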


Year of completion:  July 2018
 Advisor : Jayanthi Sivaswamy

Related Publications

  • Tarannum Mansoori, JS Gamalapati and Jayanthi Sivaswamy - Radial Peripapillary Capillary Density Measurement Using Optical Coherence Tomography Angiography in Early Glaucoma - Journal of Glaucoma 26.5 (2017): 438-443. [PDF]

  • Pujitha Appan K, Jahnavi Gamalapati S and Jayanthi Sivaswamy - Detection of neovascularization in retinal images using semi-supervised learning - Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on. IEEE, 2017. [PDF]

  • T Mansoori, JS Gamalapati and Jayanthi Sivaswamy - Measurement of radial peripapillary capillary density in the normal human retina using optical coherence tomography angiography - Journal of Glaucoma 26.3 (2017): 241-246. [PDF]


Downloads

thesis

Learning Deep and Compact Models for Gesture Recognition


Koustav Mullick

Abstract

The goal of gesture recognition is to interpret human gestures, which can originate from any bodily motion but are mainly confined to the face or hands, and to let a person interact with a computer without physically touching it. It can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans. Many approaches using cameras and computer vision algorithms have been made towards the interpretation of sign language and the identification and recognition of posture, gait, proxemics and human behaviors. However, effective gesture detection and classification can be quite challenging. Firstly, there can be a wide range of variations in the way gestures are performed, so a model needs to be generic and robust enough to handle variations in surrounding conditions, appearance, noise and the individuals performing the gestures. Secondly, developing a model that can give real-time predictions and run on low-power devices with limited memory and processing capacity is another challenge. Since deep learning models tend to have a large number of parameters, a large model not only fails to fit on a mobile device but is also difficult to use for real-time inference.

In this thesis we address both of the above-mentioned difficulties. We propose an end-to-end trainable model capable of learning both the spatial and temporal features present in a gesture video directly from the raw video frames, achieved by combining the strengths of 3D Convolutional Neural Networks and the Long Short-Term Memory (LSTM) variant of Recurrent Neural Networks. Further, we explore ways to reduce the parameter space of such models without compromising much on performance. In particular, we look at two ways of obtaining compact models with fewer parameters: learning smaller models using the idea of knowledge distillation, and reducing large models' sizes by weight pruning.

Our first contribution is the joint, end-to-end trainable 3D-Convolutional Neural Network and LSTM. Convolutional Neural Networks preserve both spatial and temporal information over the layers and can identify patterns over short durations, but their inputs need to be of fixed size, which may not hold for videos. LSTMs, in contrast, have no difficulty preserving information over longer durations and can work with variable-length input sequences; however, they do not preserve spatial patterns and hence work better when fed with features that already encode some spatio-temporal information rather than raw pixel values. The joint model leverages the advantages of both. Experimentally, we verify that this is indeed the case, as the joint model outperforms the individual baseline models. Additionally, the components can be pre-trained individually and later fine-tuned in a complete end-to-end fashion to further boost the network's potential to capture information. We obtain a near state-of-the-art result with our proposed model on the ChaLearn-2014 dataset for sign language recognition from videos, using a much simpler model and training mechanism than the best-performing model. In our second contribution, we look into ways to learn compact models that enable real-time inference on hand-held devices where power and memory are constrained.
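A skeletal version of the joint 3D-CNN + LSTM model described above, assuming pre-extracted fixed-size clips from each video; the layer counts, channel widths and single-layer LSTM are illustrative assumptions, not the thesis's exact configuration:

    import torch
    import torch.nn as nn

    class C3DLSTM(nn.Module):
        def __init__(self, n_classes, feat_dim=256, hidden=256):
            super().__init__()
            self.c3d = nn.Sequential(                       # per-clip spatio-temporal features
                nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool3d(2),
                nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),
            )
            self.proj = nn.Linear(64, feat_dim)
            self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, clips):
            # clips: (batch, n_clips, 3, T, H, W) -- short fixed-size clips of one video
            b, n = clips.shape[:2]
            x = clips.flatten(0, 1)                         # (b*n, 3, T, H, W)
            f = self.c3d(x).flatten(1)                      # (b*n, 64)
            f = self.proj(f).view(b, n, -1)                 # (b, n_clips, feat_dim)
            out, _ = self.lstm(f)                           # LSTM handles variable n_clips
            return self.head(out[:, -1])                    # classify from the last step

The design mirrors the division of labor in the abstract: the 3D-CNN digests fixed-size clips into short-range spatio-temporal features, while the LSTM aggregates an arbitrary number of them over the full gesture.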
To this end, we distill or transfer knowledge from a larger teacher network to a smaller student network; without teacher supervision, the student network does not have enough capacity to perform well using class labels alone. We demonstrate this on the same ChaLearn-2014 dataset. To the best of our knowledge, this is the first work to explore knowledge distillation from a teacher to a student network in a video classification task. We also show that training networks with the Adam optimization technique, combined with weight decay, helps obtain sparser models through weight pruning: training with Adam encourages many weights to become very low by penalizing high weight values and adjusting the learning rate accordingly, and removing these low-valued weights yields sparser models than those trained with SGD (also with weight decay). Experimental results on both the gesture recognition task and an image classification task on the CIFAR dataset validate these findings.
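Two hedged sketches of the techniques named above: the standard soft-target distillation loss (Hinton-style; the temperature and mixing weight are assumed values, not the thesis's) and magnitude-based weight pruning after Adam + weight-decay training (the threshold is illustrative):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
        """Mix KL on temperature-softened outputs with hard-label cross-entropy."""
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)                                   # keep gradient scale comparable
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

    def prune_by_magnitude(model, threshold=1e-3):
        """Zero out the near-zero weights left after Adam + weight-decay training."""
        with torch.no_grad():
            for p in model.parameters():
                p.masked_fill_(p.abs() < threshold, 0.0)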

Year of completion:  July 2018
 Advisor : Anoop M Namboodiri

Related Publications

  • Koustav Mullick and Anoop M. Namboodiri - Learning Deep and Compact Models for Gesture Recognition - 2017 IEEE International Conference on Image Processing, Beijing, China. [PDF]


Downloads

thesis

Gender Differences in Facial Emotion Perception for User Profiling via Implicit Behavioral Signals


Maneesh Bilalpur

Abstract

Understanding human emotions has been of research interest to multiple domains of modern science, namely Neuroscience, Psychology and Computer Science. The ultimate goals differ across these domains: neuroscientists study emotions primarily to understand the structural and functional abilities of the brain, psychologists to understand human interactions, and computer scientists to design interfaces and automate certain human-centric tasks. Several earlier works have suggested that emotions have two facets, perception and expression, and have advised studying the two as separate entities. This work studies the existence of gender differences in emotion perception (specifically, of the Ekman emotions) and aims at utilizing such differences for user profiling, particularly in terms of gender and emotion recognition. We employ implicit signals, namely the non-invasive electrical scalp activity of the brain recorded via electroencephalography (EEG) and gaze patterns acquired with low-cost commercial devices, to achieve this. We study the impact of facial emotion intensity and of facial regions in evoking these differences, using stimuli of different intensities and by masking face regions deemed important in previous studies. We expressly examine the implicit signals for their ecological validity. Correlations between our findings and previous studies from the above domains, in terms of Event-Related Potentials (ERPs) and fixation distributions, add uniqueness and strength to our work. We achieve reliable gender and emotion recognition with Support Vector Machine based classifiers, and further design a deep learning model that significantly outperforms them. We also analyze emotion-specific time windows and key electrodes for maximum gender recognition, arriving at some interesting conclusions. The appendix chapter on cross-visualization cognitive workload classification using EEG attempts to quantify workload in order to evaluate user interfaces. We employ four common yet distinct data visualization methods to induce varying levels of workload through a standard n-back task, and attempt to classify workload across visualizations with deep learning through transfer learning. We compare its performance against the Proximal Support Vector Machines adopted in earlier works for within-visualization workload classification.
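For concreteness, a hedged sketch of the SVM-based recognition step (the feature extraction, kernel choice and cross-validation protocol are assumptions; the thesis's exact preprocessing is not reproduced here):

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    def evaluate_svm(X, y):
        """X: (n_trials, n_features) EEG epoch features, e.g. per-electrode
        band powers; y: binary labels (gender, or one-vs-rest emotion)."""
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
        return cross_val_score(clf, X, y, cv=5).mean()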

Year of completion:  July 2018
 Advisor : Ramanathan Subramanian

Downloads

thesis

Machine Learning for Source-code Plagiarism Detection


Jitendra Yasaswi Bharadwaj katta

Abstract

This thesis presents a set of machine learning and deep learning approaches for building source-code plagiarism detection systems. The task of plagiarism detection can be treated as assessing the amount of similarity between given entities, where the entities can be text documents, source code, etc. Plagiarism detection can thus be formulated as a fine-grained pattern classification problem. The detection process begins by transforming each entity into a feature representation; these features represent their entities in a discriminative high-dimensional space, where we can measure similarity. Here, by entity we mean a solution to a programming assignment in a typical computer science course. The quality of the features determines the quality of detection. As our first contribution, we propose a machine learning based approach for plagiarism detection in programming assignments using source-code metrics. Most well-known plagiarism detectors either employ a text-based approach or use features based on properties of the program at the syntactic level. However, both approaches succumb to code obfuscation, which is a huge obstacle for automatic software plagiarism detection. Our proposed method uses source-code metrics as features, extracted from the intermediate representation of a program in a compiler infrastructure such as gcc. We demonstrate the use of unsupervised and supervised learning techniques on the extracted feature representations and show that our system is robust to code obfuscation. We validate our method on assignments from an introductory programming course, where preliminary results show that our system performs better than other popular tools like MOSS. For visualizing the local and global structure of the features, we obtain low-dimensional representations using t-SNE, a variation of Stochastic Neighbor Embedding that can preserve neighborhood identity in low dimensions. Based on this idea of preserving neighborhood identity, we mine interesting information such as the diversity in student solution approaches to a given problem; the presence of well-defined clusters in the low-dimensional visualizations demonstrates that our features capture interesting programming patterns. As our second contribution, we demonstrate how deep neural networks can be employed to learn features for source-code plagiarism detection. We employ a character-level Recurrent Neural Network (char-RNN), a character-level language model, to map the characters in a source-code file to continuous-valued vectors called embeddings, and we use these program embeddings as deep features for plagiarism detection in programming assignments. Many popular plagiarism detection tools are based on n-gram techniques at the syntactic level; however, such approaches fail to capture the long-term dependencies (non-contiguous interactions) present in source code. The proposed deep features, in contrast, capture these non-contiguous interactions. They are generic in nature, and there is no need to fine-tune the char-RNN model separately for the program submissions of each individual problem set. Our experiments show the effectiveness of deep features in the task of classifying assignment program submissions as copy, partial-copy and non-copy. As our final contribution, we demonstrate how to extract local deep features from source code.
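To make the deep-feature comparison concrete, here is a toy sketch of bucketing a submission pair once each program has been mapped to an embedding (the char-RNN itself is elided; the embeddings are assumed given, and the similarity thresholds are invented for illustration):

    import numpy as np

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

    def classify_pair(emb_a, emb_b, t_copy=0.9, t_partial=0.7):
        """Label a pair of program embeddings as copy / partial-copy / non-copy."""
        s = cosine(emb_a, emb_b)
        if s >= t_copy:
            return "copy"
        if s >= t_partial:
            return "partial-copy"
        return "non-copy"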
We represent programs using local deep features and develop a framework to retrieve suspected plagiarized cases for a given query program. Such representations are useful for identifying near-duplicate program pairs, where only part of the program, certain lines, or blocks of code are copied. In such cases, local feature representations are more useful than a single global feature per program. We build the retrieval framework on a Bag of Words (BoW) approach to retrieve suspected plagiarized and partially-plagiarized (near-duplicate) cases for a given query program.
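A minimal BoW retrieval loop in the spirit of the framework above (a sketch under assumptions: local deep features are quantized with k-means into a codebook, and candidates are ranked by histogram intersection; the codebook size and top-k are illustrative):

    import numpy as np
    from sklearn.cluster import KMeans

    def build_codebook(corpus_local_feats, k=256):
        """Fit k 'code words' over local deep features pooled from the corpus."""
        return KMeans(n_clusters=k, n_init=10).fit(np.vstack(corpus_local_feats))

    def bow_histogram(local_feats, codebook):
        """Normalized word-count histogram for one program's local features."""
        words = codebook.predict(local_feats)
        h = np.bincount(words, minlength=codebook.n_clusters).astype(float)
        return h / (h.sum() + 1e-12)

    def retrieve(query_hist, corpus_hists, top_k=5):
        """Rank corpus programs by histogram intersection with the query."""
        scores = [np.minimum(query_hist, h).sum() for h in corpus_hists]
        return np.argsort(scores)[::-1][:top_k]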

Year of completion:  July 2018
 Advisors : Prof. C. V. Jawahar and Suresh Purini

Related Publications

  • Jitendra Yasaswi, Suresh Purini and C.V. Jawahar - Plagiarism Detection in Programming Assignments Using Deep Features - 4th Asian Conference on Pattern Recognition (ACPR 2017), Nanjing, China, 2017. [PDF]

  • Jitendra Yasaswi Bharadwaj katta, Srikailash G, Anil Chilupuri, Suresh Purini and C.V. Jawahar - Unsupervised Learning Based Approach for Plagiarism Detection in Programming Assignments - ISEC 2017. [PDF]


Downloads

thesis
