Indian Language OCR: Current Status, Challenges, and Future Directions

In Conjunction with ICVGIP 2025

Dec 17, 2025, IIT Mandi

Topic of the Workshop

This workshop will address the challenges and opportunities in developing Optical Character Recognition (OCR) systems for Indian languages, encompassing printed documents, handwritten manuscripts, and scene text. The diversity of scripts, writing styles, and document formats in India presents unique research problems that require innovative approaches. The workshop will present recent advances, datasets, tools, and applications, while also identifying open challenges and future research directions. Robust multilingual OCR is central to ongoing digitization efforts in governance, education, and industry, and is a critical enabler for the broader success of Artificial Intelligence (AI) and Large Language Models (LLMs) in the Indian context.


Keynote Speakers

Soumyadeep Dey
Microsoft R&D
Bio of Speaker:

Soumyadeep Dey is a Sr. Applied Scientist at Microsoft, India. He received his Ph.D. from the Department of Computer Science & Engineering at IIT Kharagpur in 2019. His research interests include image processing, computer vision, document segmentation, and applications of machine learning techniques for resource-constrained devices.

Abstract:

Smartphones have become ubiquitous computing devices due to their mobility and ease of use. However, they present unique challenges for AI/ML because of limited processing power, memory, storage, battery life, and varied network connections, as well as data privacy concerns. This talk will discuss several challenges in developing AI/ML models for edge computing, focusing on applications such as document-image cleanup and segmentation.


Ravi Kiran Sarvadevabhatla
IIIT Hyderabad
Bio of Speaker:

Dr. Ravi Kiran Sarvadevabhatla is an Associate Professor at IIIT Hyderabad and a core contributor to national AI initiatives, including leading Vision projects for the Government-funded BharatGen LLM. He holds an MS from the University of Washington, Seattle, and a Ph.D. from IISc (2018), where his thesis won the IUPRAI Best Thesis Award. His research interests lie at the intersection of AI, multi-modal multimedia data, Robotics, and Human-Computer Interaction. (For more details, visit: https://ravika.github.io/)

Abstract:

BharatGen is a government-funded, mission-mode initiative aimed at building sovereign, multimodal, and multilingual foundation models for India. The talk will briefly introduce BharatGen and present work on document foundation models, including the open-source release of Patram-7B, India’s first vision–language document foundation model. It will outline the capabilities enabled by such models for layout-aware and multilingual document understanding, and describe tools developed along the way for evaluation, visualization, and error analysis. The talk will then discuss applications currently being developed. It will also touch upon lessons from building practical vision–language systems, including scenarios where inputs extend beyond documents. The talk will conclude by highlighting capabilities unlocked by document foundation models and open research directions.


Organisers

Ajoy Mondal

IIIT Hyderabad

ajoy.mondal@iiit.ac.in

Anand Mishra

IIT Jodhpur

mishra@iitj.ac.in

Ankur Rana

Punjabi University

arana@pbi.ac.in

Chetan Arora

IIT Delhi

chetan@cse.iitd.ac.in

C. V. Jawahar

IIIT Hyderabad

jawahar@iiit.ac.in

Ganesh Ramakrishnan

IIT Bombay

ganesh@cse.iitb.ac.in

Gurupreet Singh Lehal

IIIT Hyderabad

gs.lehal@research.iiit.ac.in

Ravi Kiran Sarvadevabhatla

IIIT Hyderabad

ravi.kiran@iiit.ac.in

Tushar Patnaik

CDAC Noida

tusharpatnaik@cdac.in

Venkatapathy Subramanian

IIT Bombay

venkatapathy@cse.iitb.ac.in

Program Schedule

Time | Session | Speakers
9:00 - 9:15 | Introduction and Welcome | Anand Mishra
9:15 - 9:50 | Keynote-1: AI/ML for mobile devices @ Microsoft-IDC (Session Chair: Anand Mishra) | Soumyadeep Dey
9:50 - 10:50 | Student Presentations Session-1 (4 presentations, 15 min each; Session Chair: Ajoy Mondal) | -
10:50 - 11:00 | Poster Lightning Talks (5 presentations, 2 min each; Session Chair: Anand Mishra) | -
11:00 - 12:00 | Coffee Break (with Poster/Demo Displays) | -
12:00 - 12:35 | Keynote-2: Document Foundation Models and Applications at BharatGen (Session Chair: Chetan Arora) | Ravi Kiran S
12:35 - 12:55 | Student Presentations Session-2 (2 presentations, 10 min each; Session Chair: Ajoy Mondal) | -
12:55 - 1:00 | Concluding Remarks | C. V. Jawahar

Student Presentation Details

S. No | Presenter | Title | Track

Student Presentation - Session 1 (Session Chair: Dr. Ajoy Mondal; Duration: 15 min / presentation)
1 | Anik De | IndicPhotoOCR | Oral, Demo
2 | Dikshant Sharma | X-STAR: Cross-lingual Scene Text-aware Image Retrieval | Oral, Demo
3 | Evani Lalitha | SemiHastakshar: Generalizable Indic Handwritten OCR through Semi-Supervised Learning | Oral, Poster
4 | Sahithi Kukkala | IndicDLP: A Foundational Dataset for Multi-Lingual and Multi-Domain Document Layout Parsing | Oral

Poster Lightning Talks (Duration: 2 min / presentation)
1 | Radha Krishna Deshpande | Intelligent Search Engine | Poster, Demo
2 | Shashank Krishna Vempati | Lipikar: A Unified Multilingual OCR Ecosystem for 22+ Indian Languages | Demo
3 | Pratyush Jena | Unveiling Text in Challenging Stone Inscriptions: A Character-Context-Aware Patching Strategy for Binarization | Poster
4 | K Lokesh | Recognition and Restoration of Historical Manuscript | Poster
5 | Shaon Bhattacharyya | FormLens: From Ink to Insight with Adapting Vision-Language Models for Handwritten Form Digitization | Poster, Demo

Student Presentation - Session 2 (Session Chair: Dr. Ajoy Mondal; Duration: 10 min / presentation)
1 | Saumya Mundra | MIST: Multilingual Incidental Dataset for Scene Text Detection | Oral
2 | Avijit Dasgupta | Are We There Yet? Exploring the Capabilities of MLLMs in Assistive AI Applications | Oral

Contact

For any enquiry, please email us at ocr.nltm@research.iiit.ac.in