Publications
Journal Publications
  • Siddharth Jain, Shyamgopal Karthik, and Vineet Gandhi - Simplifying Knowledge Transfer in Pretrained Models, In Transactions on Machine Learning Research (TMLR), 2025 [PDF]

  • Moneish Kumar, Vineet Gandhi, Remi Ronfard, and Michael Gleicher - Zooming On All Actors: Automatic Focus+Context Split Screen Video Generation, Eurographics, Vol. 36, No. 2, 2017. [PDF]

  • Vineet Gandhi - Pano2Vid: Automatic Cinematography for Watching 360° Videos, Eurographics Workshop on Intelligent Cinematography and Editing (2017). [PDF]

  • Rahul Anand Sharma, Vineet Gandhi, Visesh Chari and C. V. Jawahar - Automatic Analysis of Broadcast Football Videos Using Contextual Priors, Signal, Image and Video Processing (SIVP 2016), Volume 10, Issue 5, July 2016. [PDF]


Books and Book Chapters

    Conference Publications

    • Aishwarya Agarwal, Srikrishna Karanam, and Vineet Gandhi - Concept Regions Matter: Benchmarking CLIP with a New Cluster-Importance Approach, In Conference on Computer Vision and Pattern Recognition (CVPR), 2026 [PDF]

    • Neil Shah, Shirish Karande, and Vineet Gandhi - NAM-to-Speech Conversion with Multitask-Enhanced Autoregressive Models, In Interspeech, 2025 [PDF]

    • Rohit Girmaji, Siddharth Jain, Bhav Beri, Sarthak Bansal, and Vineet Gandhi - Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues, In International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2025 [PDF]

    • Rohit Girmaji, Bhav Beri, Ramanathan Subramanian, and Vineet Gandhi - EditIQ: Automated Cinematic Editing of Static Wide-Angle Videos via Dialogue Interpretation and Saliency Cues, In International Conference on Intelligent User Interfaces, IUI, 2025 [PDF]

    • Neilkumar Milankumar Shah, Shirish Karande, and Vineet Gandhi - Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset, In International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2025 [PDF]

    • Neilkumar Milankumar Shah, Ayan Kashyap, Shirish Karande, and Vineet Gandhi - MRI2Speech: Speech Synthesis from Articulatory Movements Recorded by Real-time MRI, In International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2025 [PDF]

    • Ayan Kashyap, Neilkumar Milankumar Shah, and Vineet Gandhi - Prompt-to-Correct: Automated Test-Time Pronunciation Correction with Voice Prompts, In International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2025 [PDF]

    • Aishwarya Agarwal, Srikrishna Karanam, and Vineet Gandhi - TIDE: Training Locally Interpretable Domain Generalization Models Enables Test-time Correction, In Computer Vision and Pattern Recognition, CVPR, 2025 [PDF]

    • Darshana S, Naresh Manwani, and Vineet Gandhi - Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning, In Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2025 [PDF]

    • Darshana S, Makarand Tapaswi, and Vineet Gandhi - Investigating Mechanisms for In-Context Vision Language Binding, In Computer Vision and Pattern Recognition Conference workshops (CVPR-W), 2025 [PDF]

    • Darshana S, Varun Gupta, Darshan Singh S, Zeeshan Khan, Vineet Gandhi, and Makarand Tapaswi - VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment, In Computer Vision and Pattern Recognition (CVPR), 2025 [PDF]

    • K Saiteja, Neil Kumar Shah, Vishal Tambrahalli, Neha S and Vineet Gandhi - ParrotTTS: Text-to-Speech Synthesis by Exploiting Self-Supervised Representations, In EACL, 2024 [PDF]

    • Sudheer Achary, Rohit Girmaji, Adhiraj Anil Deshmukh, and Vineet Gandhi - Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings, In WACV, 2024 [PDF]

    • Shyamgopal Karthik, Ameya Prabhu, Puneet K. Dokania and Vineet Gandhi - No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks, The Ninth International Conference on Learning Representations (ICLR 2021) [PDF]

    • Shyamgopal Karthik, Abhinav Moudgil and Vineet Gandhi - Exploring 3 R’s of Long-term Tracking: Re-detection, Recovery and Reliability, Winter Conference on Applications of Computer Vision (WACV 2020). [PDF]

    • Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda and Vineet Gandhi - Tidying Deep Saliency Prediction Architectures, International Conference on Intelligent Robots and Systems (IROS 2020) [PDF]

    • Aasheesh Singh, Aditya Kamireddypalli, Vineet Gandhi and K Madhava Krishna - LiDAR Guided Small Obstacle Segmentation, International Conference on Intelligent Robots and Systems (IROS 2020) [PDF]

    • K L Bhanu Moorthy, Moneish Kumar, Ramanathan Subramanian, and Vineet Gandhi - GAZED – Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings, Conference on Human Factors in Computing Systems (ACM CHI 2020) [PDF]

    • Sriram N N, Tirth Maniar, Jayaganesh Kalyanasundaram, Vineet Gandhi and Madhava Krishna - Talk to the Vehicle: Language Conditioned Autonomous Navigation of Self Driving Cars, International Conference on Intelligent Robots and Systems (IROS 2019) [PDF]

    • Syed Ashar Javed, Shreyas Saxena and Vineet Gandhi - Learning Unsupervised Visual Grounding Through Semantic Self-Supervision, 28th International Joint Conference on Artificial Intelligence (IJCAI 2019) [PDF]

    • Aryaman Gupta, Kalpit Thakkar, Vineet Gandhi and P J Narayanan - Nose, Eyes and Ears: Head Pose Estimation by Locating Facial Keypoints, International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019) [PDF]

    • Krishnam Gupta, Syed Ashar Javed, Vineet Gandhi and K Madhava Krishna - MergeNet: A Deep Net Architecture for Small Obstacle Discovery, International Conference on Robotics and Automation (ICRA 2018), Brisbane Convention and Exhibition Centre [PDF]

    • Vatsal Shah and Vineet Gandhi - An Iterative Approach for Shadow Removal in Document Images, International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), Calgary, Alberta, Canada. [PDF]

    • Pranjal Kumar Rai, Sajal Maheshwari and Vineet Gandhi - Document Quality Estimation using Spatial Frequency Response, International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), Calgary, Alberta, Canada. [PDF]

    • Kranthi Kumar, Moneish Kumar, Vineet Gandhi and Ramanathan Subramanian - Watch to Edit: Video Retargeting using Gaze, The 39th Eurographics Conference (Eurographics 2018), Delft, The Netherlands [PDF]

    • Rahul Anand Sharma, Bharath Bhat, Vineet Gandhi and C. V. Jawahar - Automated Top View Registration of Broadcast Football Videos, IEEE Winter Conference on Applications of Computer Vision (WACV 2018), Lake Tahoe, CA, USA, 2018. [PDF]

    • Pranjal Kumar Rai, Sajal Maheshwari, Ishit Mehta, Parikshit Sakurikar and Vineet Gandhi - Beyond OCRs for Document Blur Estimation, 14th IAPR International Conference on Document Analysis and Recognition (ICDAR 2017), Kyoto, Japan. [PDF]

    • Sajal Maheshwari, Pranjal Kumar Rai, Gopal Sharma, and Vineet Gandhi - Document Blur Detection using Edge Profile Mining, Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing. ACM, (2016). [PDF]

    • Remi Ronfard, Benoit Encelle, Nicolas Sauret, Pierre-Antoine Champin, Thomas Steiner, Vineet Gandhi, Cyrille Migniot and Florent Thiery - Capturing and Indexing Rehearsals: The Design and Usage of a Digital Archive of Performing Arts, Digital Heritage 2015, Vol. 2, IEEE (2015). [PDF]

    • Vineet Gandhi and Remi Ronfard - A Computational Framework for Vertical Video Editing, 4th Workshop on Intelligent Camera Control, Cinematography and Editing. (2015). [PDF]

    • Vineet Gandhi, Remi Ronfard, and Michael Gleicher - Multi-Clip Video Editing from a Single Viewpoint, Proceedings of the 11th European Conference on Visual Media Production. ACM, (2014). [PDF]

    • Vineet Gandhi and Remi Ronfard - Detecting and Naming Actors in Movies using Generative Appearance Models, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013. [PDF]

    • Vineet Gandhi, Jan Cech and Radu Horaud - High-Resolution Depth Maps Based on TOF-Stereo Fusion, IEEE International Conference on Robotics and Automation (ICRA), 2012. [PDF]


    arXiv and Technical Reports

    • Kanishk Jain and Vineet Gandhi - Comprehensive Multi-Modal Interactions for Referring Image Segmentation, In arXiv 2021 [PDF]

    • Samyak Jain, Pradeep Yarlagadda, Shreyank Jyoti, Shyamgopal Karthik, Ramanathan Subramanian and Vineet Gandhi - ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction, In arXiv 2020 [PDF]

    • Sarath Sivaprasad, Ankur Singh, Naresh Manwani and Vineet Gandhi - The Curious Case of Convex Neural Networks, In arXiv 2020 [PDF]

    • Shyamgopal Karthik, Ameya Prabhu and Vineet Gandhi - Simple Unsupervised Multi-Object Tracking, In arXiv 2020 [PDF]

    • Sudheer Achary, K L Bhanu Moorthy, Ashar Javed, Nikita Shravan, Vineet Gandhi and Anoop Namboodiri - CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems, In arXiv 2019 [PDF]

    • Abhinav Moudgil and Vineet Gandhi - Long-Term Visual Object Tracking Benchmark, In arXiv 2017 [PDF]

    • Remi Ronfard, Vineet Gandhi, and Laurent Boiron - The Prose Storyboard Language: A Tool for Annotating and Directing Movies, arXiv preprint arXiv:1508.07593 (2015) [PDF]