Avijit and his co-authors presented their paper “Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives” at CVPR 2024. Ego-Exo4D is a diverse, large-scale, multimodal, multiview video dataset and benchmark challenge. It centers on simultaneously captured egocentric and exocentric videos of skilled human activities (e.g., sports, music, dance, bike repair). A total of 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures of 1 to 42 minutes each and 1,286 hours of video combined. The multimodal nature of the dataset is unprecedented: the video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera poses, IMU, and multiple paired language descriptions, including a novel "expert commentary" by coaches and teachers, tailored to the skilled-activity domain.

Along with Avijit, there were students from the labs of Professors Makarand, PJ Narayan, Avinash, and Jawahar, as well as a couple of alumni who now work at major tech companies like Amazon and Microsoft. The event was filled with paper presentations, workshops, and industry booths, making it a treasure trove of knowledge and networking opportunities. Avijit co-presented the paper with Siddhant and team in a session that ran for an hour and a half.

During the conference, Avijit also attended the CVIT alumni meet with around 15-20 former CVIT members, many of whom now work at companies like Amazon, Microsoft, Walmart, and Meta. He had the chance to reunite with friends, including Siddhant, and spend time with seniors who had moved to Seattle. The alumni gathered at a park in downtown Seattle and later went out for dinner.

Avijit received valuable feedback on his research from various professors including Professor Jitendra Malik and Professor Dima Damen, who helped refine his problem formulation. He found the conference environment to be welcoming, with senior researchers being approachable and open to discussions. 

Avijit observed differences in research culture across countries. In India, PhD students often work long hours in the lab, while in Europe there is a stronger emphasis on work-life balance, with students working set hours. The US research culture varies, with some similarities to India in terms of work intensity. He also noted that advisor-student relationships tend to be similar to those back on campus, open to discussion and debate.

Before CVPR 2024, Avijit attended a medical imaging conference in Melbourne in 2017 during his master's studies. Looking ahead, he aims to complete his PhD and submit his thesis by next year. While he is currently considering a career in industry, he remains open to future opportunities in academia.

Avijit reflected on his journey at CVIT under Professor Jawahar. He interned at CVIT in 2016 and 2017 before starting his PhD. Initially, he found it challenging to align with his advisor's high-level discussions, but over time he adapted and now finds their interactions productive. He has also traveled with Professor Jawahar before, including a trip to New York for a Meta project in 2022.

He enjoys living in Hyderabad and finds the campus and hostel facilities a great place for research students during their initial years. He initially stayed on campus, moved out during COVID-19, and returned after getting married. Housing policies have since changed, and PhD students are now required to vacate staff quarters after five years.

Campus facilities have improved during his time there, and despite some setbacks, he believes the institute still has better computing resources than most Indian universities. Even while focused on preparing a paper for submission, he found CVPR 2024 to be an enriching experience, both professionally and personally. The conference provided valuable networking opportunities, research insights, and exposure to the global computer vision community.

https://www.youtube.com/watch?v=TTVoW289UoU&feature=youtu.be

In this video, Avijit gives an overview of his research with the CVIT lab in collaboration with Project Aria. His team is working on the Driver Intent Prediction project, a computer vision application for accident prediction.