Nerve Block Target Localization and Needle Guidance for Autonomous Robotic Ultrasound Guided Regional Anesthesia


ABHISHEK TYAGI

Abstract

Ultrasound-guided regional anesthesia (UGRA) involves approaching target nerves with a needle in real time, enabling precise deposition of the drug with increased success rates and fewer complications. Development of autonomous robotic systems capable of administering UGRA is desirable for remote settings and localities where anesthesiologists are unavailable. Real-time segmentation of nerves, needle tip localization and needle trajectory extrapolation are required for developing such a system. In the first part of this thesis, we developed models to localize nerves in the ultrasound domain using a large dataset. Our prospective study enrolled 227 subjects who were systematically scanned for brachial plexus nerves in various settings using three different ultrasound machines to create a dataset of 227 unique videos. In total, 41,000 video frames were annotated by experienced anesthesiologists using partial automation with object tracking and active contour algorithms. Four baseline neural network models were trained on the dataset and their performance was evaluated for object detection and segmentation tasks. Generalizability of the best-suited model was then tested on datasets constructed from separate ultrasound scanners, with and without fine-tuning. The results demonstrate that deep learning models can be leveraged for real-time segmentation of the brachial plexus in neck ultrasonography videos with high accuracy and reliability. Using these nerve segmentation predictions, we define automated anesthesia needle targets by fitting an ellipse to the nerve contours.
The second part of this thesis focuses on localization of the needles and development of a framework to guide the needles toward their targets. For the segmentation of the needle, a neural network pre-trained on natural RGB images is first fine-tuned on a large ultrasound dataset for domain transfer and then adapted for the needle using a small dataset. The segmented needle’s trajectory angle is calculated using the Radon transform and the trajectory is extrapolated from the needle tip. The intersection of the extrapolated trajectory with the needle target guides the needle navigation for drug delivery. The average error in the needle trajectory’s angle was 2°, the average error in the trajectory’s distance from the center of the image was 10 pixels (2 mm), and the average error in needle tip location was 19 pixels (3.8 mm), which is within the acceptable range of 5 mm according to experienced anesthesiologists. The entire dataset has been released publicly for further study by the research community.
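The trajectory step described above can be illustrated with a small sketch. This is not the thesis implementation: it assumes a binary needle mask, a known tip position, and a target ellipse that has already been fitted, and it approximates the Radon transform with a NumPy-only projection histogram. The function names `needle_angle_deg` and `ray_hits_ellipse` and all parameters are hypothetical.

```python
import numpy as np

def needle_angle_deg(mask, step_deg=0.5):
    """Estimate the orientation of a thin line in a non-empty binary mask.

    For each candidate angle, pixel coordinates are projected onto the
    normal of that direction and binned (a discrete, Radon-style line
    integral). The angle whose projection has the strongest peak wins.
    Angles are in image coordinates (x to the right, y downward).
    """
    ys, xs = np.nonzero(mask)
    xs = xs - xs.mean()
    ys = ys - ys.mean()
    best_angle, best_peak = 0.0, -1
    for angle in np.arange(0.0, 180.0, step_deg):
        t = np.deg2rad(angle)
        # Signed distance of each pixel from a line through the centroid
        d = -xs * np.sin(t) + ys * np.cos(t)
        edges = np.arange(d.min() - 0.5, d.max() + 1.5, 1.0)
        peak = np.histogram(d, bins=edges)[0].max()
        if peak > best_peak:
            best_angle, best_peak = float(angle), int(peak)
    return best_angle

def ray_hits_ellipse(tip, angle_deg, center, axes, rot_deg=0.0):
    """Return the first point where a ray from `tip` meets the ellipse, or None."""
    tip = np.asarray(tip, dtype=float)
    axes = np.asarray(axes, dtype=float)
    t, r = np.deg2rad(angle_deg), np.deg2rad(rot_deg)
    direction = np.array([np.cos(t), np.sin(t)])
    # Rotate into the ellipse frame, then scale so the ellipse is a unit circle
    rot = np.array([[np.cos(r), np.sin(r)], [-np.sin(r), np.cos(r)]])
    p = rot @ (tip - np.asarray(center, dtype=float)) / axes
    v = rot @ direction / axes
    # Solve |p + s v|^2 = 1 for the smallest s >= 0
    a, b, c = v @ v, 2.0 * (p @ v), p @ p - 1.0
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None
    roots = [(-b - np.sqrt(disc)) / (2 * a), (-b + np.sqrt(disc)) / (2 * a)]
    forward = [s for s in roots if s >= 0]
    return tip + min(forward) * direction if forward else None
```

A `None` result from `ray_hits_ellipse` would mean the extrapolated trajectory misses the target, so the needle would need to be re-angled before advancing.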

Year of completion: November 2023
Advisor: Jayanthi Sivaswamy

Data exploration, Playing styles, and Gameplay for Cooperative Partially Observable games: Pictionary as a case study


Kiruthika Kannan

Abstract

Cooperative human-human communication becomes challenging when restrictions such as a difference in communication modality and limited time are imposed. In this thesis, we present the popular cooperative social game Pictionary as an online multimodal test bed to explore the dynamics of human-human interactions in such settings. Pictionary is a multiplayer game where the players attempt to convey a word or phrase through drawing. The restriction imposed on the mode of communication gives rise to intriguing diversity and creativity in the players’ responses. To explore player activity in Pictionary, we developed an online browser-based Pictionary application and used it to collect a Pictionary dataset. We conduct an exploratory analysis of the dataset, examining the data across three domains: global session-related statistics, target word-related statistics, and user-related statistics. We also present our interactive dashboard to visualize the analysis results. We identify attributes of player interactions that characterize cooperative gameplay. Using these attributes, we find stable role-specific playing style components independent of game difficulty. In terms of gameplay and the larger context of cooperative partially observable communication, our results suggest that too much interaction or unbalanced interaction negatively impacts game success. Additionally, the playing style components discovered via our analysis align with select player personality types proposed in existing frameworks for multiplayer games. Furthermore, this thesis explores atypical sketch content within the Pictionary dataset. We present various baseline models for detecting such atypical content. We conduct a comparative analysis of three baseline models, namely BiLSTM+CRF, SketchsegNet+, and modified CRAFT. Results indicate that the image segmentation-based deep neural network outperforms recurrent models that rely on stroke features or stroke coordinates as input.

Year of completion: November 2023
Advisor: Ravi Kiran Sarvadevabhatla

Security from uncertainty: Designing privacy-preserving verification methods using Noise


Praguna Manvi

Abstract

Biometric authentication plays an increasingly prominent role in today’s products and services for verifying an individual’s identity. It is not only efficient but also practical, as it establishes a unique link to an individual through their physical and behavioral characteristics [43]. Unlike conventional authentication mechanisms like passwords or documents, biometric traits are inherent to each individual, eliminating the need to memorize additional information [64]. However, the security and privacy of biometric templates used in authentication remain primary concerns, as biometric data is strongly and irrevocably tied to an individual, as emphasized in [42]. In the context of remote authentication, Secure Multiparty Computation (SMC) offers a powerful solution. SMC enables two parties to interactively compute a function using their private inputs without disclosing any information except for the output itself [19]. This approach ensures that biometric template comparison is carried out in a privacy-preserving manner, enhancing both security and privacy in authentication services. In this thesis, we introduce a unique approach to iris, fingerprint, and face verification by incorporating “noise” into the authentication process. In our work, “noise” refers to signals obtained from non-discriminatory or unreliable regions of biometric characteristics. Our extensive empirical evaluation reveals a correlation among noise features, and we leverage this correlation in a novel Secure Two-Party Computation (STPC) design. This STPC design operates on quantified uncertainty between noise features, providing information-theoretic security. Our approach incurs low accuracy degradation, has practical computational complexity, and is widely applicable, making it suitable for practical real-time applications.
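To make the idea of computing on private inputs concrete, the following is a generic textbook sketch of two-party additive secret sharing with Beaver triples, used here to compute the Hamming distance between two binary templates. It is not the STPC design proposed in the thesis. It assumes semi-honest parties and a trusted dealer that hands out multiplication triples, and it simulates both parties in a single process. All function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # prime modulus for the share arithmetic

def share(x):
    """Split x into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return r, (x - r) % P

def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % P

def beaver_triple():
    """Trusted-dealer step (an assumption): shares of random a, b and a*b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)

def secure_mul(x_sh, y_sh):
    """Return shares of x*y given shares of x and y."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    # Each party publishes its masked share; d and e reveal nothing about
    # x and y because a and b are uniformly random.
    d = (x_sh[0] - a0 + x_sh[1] - a1) % P
    e = (y_sh[0] - b0 + y_sh[1] - b1) % P
    z0 = (c0 + d * b0 + e * a0 + d * e) % P  # party 0 adds the public d*e term
    z1 = (c1 + d * b1 + e * a1) % P
    return z0, z1

def secure_hamming(bits_a, bits_b):
    """Hamming distance of two bit lists via x XOR y = x + y - 2xy on shares."""
    total0 = total1 = 0
    for xa, yb in zip(bits_a, bits_b):
        x_sh, y_sh = share(xa), share(yb)
        p0, p1 = secure_mul(x_sh, y_sh)
        total0 += x_sh[0] + y_sh[0] - 2 * p0
        total1 += x_sh[1] + y_sh[1] - 2 * p1
    # Only the final distance is reconstructed, never the individual bits
    return reconstruct(total0 % P, total1 % P)
```

In a real deployment each party would hold only its own share of every value, and the masked values `d` and `e` would be the only messages exchanged during a multiplication.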

Year of completion: December 2023
Advisor: Anoop M Namboodiri

Towards Enhancing Semantic Segmentation in Resource Constrained Settings


Ashutosh Mishra

Abstract

Understanding scene semantics in order to fully automate decision-making for self-driving cars is becoming a crucial task in computer vision. With recent progress in autonomous driving and the many semantic segmentation datasets proposed for road scene understanding, semantic segmentation of road scenes has become an important problem to tackle. However, training semantic segmentation models is resource-intensive because it requires multi-GPU training, which makes it a bottleneck to quickly reproducing results. This thesis introduces challenges and provides solutions to reduce the training time of segmentation models by introducing two small-scale datasets. Additionally, the thesis explores the potential of employing neural architecture search and automatic pruning techniques to create efficient segmentation modules in resource-constrained settings. Chapter 2 of the thesis introduces the problem of semantic segmentation and discusses some deep learning approaches to solve supervised semantic segmentation. We briefly discuss the different metrics used and also touch upon the statistics of various datasets that are available in the literature to train semantic segmentation models. Chapter 3 of the thesis explains the need for a dataset based on the Indian road scenario. Most of the datasets in the literature are captured in Western settings with well-defined traffic participants, delineated boundaries, etc., assumptions which seldom hold in the Indian setting. We describe the annotation pipeline, along with the quality check framework used to annotate the dataset. Although the IDD dataset [121] caters to the Indian setting, it is still quite resource-intensive in terms of GPU computation. Hence, there is a need for a low-resolution dataset with fewer labels for rapid prototyping.
We introduce our proposed datasets and provide a detailed set of experiments and statistical comparisons with the existing datasets to substantiate our claim regarding the usefulness of the proposed solution. We also show through experiments that the models trained using our datasets can be deployed on low-resource hardware such as Raspberry Pi. At the end of this chapter, we also look into the significance of the proposed datasets in facilitating challenges at two prominent conferences: the International Conference on Computer Vision (ICCV) and the National Conference on Computer Vision, Pattern Recognition, Image Processing, and Graphics (NCVPRIPG) in 2019. These challenges aimed to address semantic segmentation in resource-constrained settings, inviting innovative architectures capable of achieving decent accuracy on these proposed datasets. We also discuss the potential application of these datasets in teaching semantic segmentation through a course of notebooks introducing traditional as well as deep learning-based methods to perform segmentation. These notebooks are plug-and-play, where the first three notebooks can run on a laptop CPU, while the fourth notebook requires GPU access.
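The abstract does not name the metrics it discusses. Mean intersection-over-union (mIoU) is the most common one for semantic segmentation, so the sketch below assumes it is among them. The function name `mean_iou`, the `ignore_index` value of 255, and the label layout are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that appear in the prediction or the target.

    pred, target: integer label maps of identical shape.
    Pixels whose target equals ignore_index are excluded.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index
    # Confusion matrix: rows are true classes, columns are predicted classes
    cm = np.bincount(
        num_classes * target[keep] + pred[keep],
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0  # skip classes absent from both maps
    return float((tp[valid] / union[valid]).mean())
```

Because the confusion matrix grows with the square of the number of classes and the cost grows with the pixel count, a dataset with fewer labels and lower resolution makes this evaluation, like training itself, cheaper to run.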

Year of completion: January 2024
Advisor: C V Jawahar, Girish Varma

High-Quality 3D Fingerprint Generation: Merging Skin Optics, Machine Learning and 3D Reconstruction Techniques


Apoorva Srivastava

Abstract

Fingerprints are a widely recognized and commonly used method of identification. Contact-based fingerprints, which involve pressing the finger against a surface to obtain images, are a popular method of capturing fingerprints. However, this process has several drawbacks, including skin deformation, unhygienic conditions, and high sensitivity to the moisture content of the finger. These factors can negatively impact the accuracy of the fingerprint. Moreover, fingerprints are three-dimensional anatomical structures, and two-dimensional fingerprints do not capture the depth information of the finger ridges. While 3D fingerprint capture is less sensitive to skin moisture levels and avoids skin deformation, its adoption is limited by high cost and system complexity. The complexity and cost are mainly attributed to the use of multiple cameras, projectors, and sometimes synchronously moving mechanical parts. Photometric stereo offers a promising solution to build low-cost, simple sensors for high-quality 3D capture using only a single camera and a few LEDs. However, the method assumes that the surface being imaged is Lambertian, which is not the case for human fingers. Existing 3D fingerprint scanners based on photometric stereo also assume that the finger is Lambertian, resulting in poor reconstruction results. In this context, we introduce the Split and Knit algorithm (SnK), a 3D reconstruction pipeline based on photometric stereo for finger surfaces. The algorithm splits the reconstruction of the ridge-valley pattern and finger shape and combines them to obtain, for the first time, the 3D fingerprint reconstruction for the full finger with a single camera.
To reconstruct the ridge-valley pattern, SnK introduces an efficient way of estimating the direct illumination component by using a trained U-Net without extra hardware, which reduces the non-Lambertian nature of the finger image and enables a higher-quality reconstruction of the entire finger surface. To obtain the finger shape using a single camera, the algorithm introduces two novel approaches: (a) using IR illumination and (b) using a mirror and parametric modeling for the finger shape. Finally, we combine the overall finger shape and the ridge-valley point cloud to obtain a 3D finger phalange. The high-quality 3D reconstruction results in better matching accuracy of the captured fingerprints. Splitting the ridge-valley pattern from the finger provides an implicit way to convert a 3D fingerprint into a 2D fingerprint, making the SnK algorithm compatible with 2D fingerprint recognition systems. To apply the SnK algorithm to fingerprints, we designed a 3D-printed photometric stereo-based setup that captures contactless finger images and obtains their 3D reconstructions.
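For context, the classical Lambertian photometric stereo baseline that the abstract contrasts with can be sketched as a per-pixel least-squares problem. This is the textbook method, not the SnK pipeline. It assumes known unit light directions, no shadows or specular highlights, and the function name `photometric_stereo` is hypothetical.

```python
import numpy as np

def photometric_stereo(images, light_dirs):
    """Classical Lambertian photometric stereo.

    images:     array of shape (k, h, w), one grayscale image per light
    light_dirs: array of shape (k, 3), unit light directions (k >= 3)
    Returns unit surface normals (h, w, 3) and albedo (h, w).

    Lambertian model per pixel: I_j = albedo * dot(l_j, n).
    """
    images = np.asarray(images, dtype=float)
    light_dirs = np.asarray(light_dirs, dtype=float)
    k, h, w = images.shape
    intensities = images.reshape(k, -1)  # (k, h*w)
    # Least-squares solve of light_dirs @ g = intensities, with g = albedo * n
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)  # avoid division by zero
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)
```

Skin violates the Lambertian model through subsurface scattering and specular reflection, so this least-squares fit yields biased normals on real fingers, which is consistent with the poor reconstructions the abstract attributes to existing scanners.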

Year of completion: August 2023
Advisor: Anoop M Namboodiri
