From 2025

Learning Co-speech Gesture Representations

Sindhu B. Hegde

 

Date : 11/04/2025

 

Abstract:

Humans gesture when they speak -- gesturing is an integral part of non-verbal communication. Yet, large-scale understanding of co-speech gestures remains relatively underexplored. In this talk, I will delve into different approaches for learning co-speech gesture representations, highlight key challenges, and outline promising directions to advance gesture understanding in real-world, multimodal settings.

Bio:

Sindhu Hegde is a second-year PhD student in the Visual Geometry Group (VGG) at the University of Oxford, supervised by Prof. Andrew Zisserman. Her research is in Computer Vision, particularly in multimodal learning, video understanding, and self-supervised learning. Prior to joining Oxford, she worked as a Lead Data Scientist at Verisk Analytics. Before that, she pursued a Master’s by Research (MS) at the Centre for Visual Information Technology (CVIT), IIIT Hyderabad, supervised by Prof. C V Jawahar (IIIT-H) and Prof. Vinay Namboodiri (University of Bath, UK). Her Master’s research focused on exploiting the redundancies in vision and speech modalities for cross-modal generation.


Sounds of Pouring

Piyush Bagad

 

Date : 09/04/2025

 

Abstract:

What can possibly be scientifically interesting about such a mundane chore as pouring a liquid into a glass? We perform this action all the time but barely realise that we effortlessly learn to infer several useful physical properties in the process. For example, evidence in psychoacoustics suggests that humans can accurately infer the level of the liquid, the time to fill, the size of the container, and even the temperature of the liquid merely from the sound of the liquid being poured. How do we do it? What is the physics behind pouring? How can we use it to train an audio model to predict some of these physical properties solely from the sound of pouring? I will answer these questions in the talk.

Bio:

Piyush is a PhD student at the VGG lab in Oxford. He is supervised by Prof. Andrew Zisserman. His interests lie in time-sensitive multi-modal video understanding. Previously, he did his Master’s in AI at the University of Amsterdam. He has also worked as a Research Fellow at Wadhwani AI in Mumbai. 

 

Model Compression (Pruning and Quantization Strategies)

Dr. Srinivas Rana

 

Date : 02/04/2025

 

Abstract:

Model compression, and pruning and quantization strategies in particular, is an essential topic in machine learning. As machine learning models become more complex and resource-demanding, it is crucial to explore ways to make them more efficient without sacrificing their performance. Pruning and quantization are two key techniques that help reduce the size of models, improve inference speed, and lower resource consumption, making models more suitable for deployment on edge devices or in environments with limited computational power.

Bio:

Dr. Srinivas Rana is currently a Senior ML Scientist at Wadhwani AI, where he leads a portfolio of healthcare solutions for screening and diagnostics across diverse domains. Previously, he worked in the UK with several organizations in the field of medical devices and life sciences. He holds a PhD from IIT Madras.