Handwritten Word Recognition for Indic & Latin scripts using Deep CNN-RNN Hybrid Networks

 


Kartik Dutta

Abstract

Handwriting recognition (HWR) in Indic scripts is a challenging problem due to the inherent subtleties of the scripts, the cursive nature of the handwriting and the similar shapes of many characters. Although much research has been done on text recognition, the vision community has focused primarily on English. Furthermore, the lack of publicly available handwriting datasets in Indic scripts has hindered the development of handwritten word recognizers and made direct comparison across methods impossible. This lack of annotated data also makes it challenging to train deep neural networks, which contain millions of parameters. This situation is surprising given that there are over 400 million Devanagari speakers in India alone. We first tackle the lack of annotated data using several approaches. We describe a framework for annotating handwritten word images at large scale without the need for manual segmentation and label re-alignment. Using this framework, we create and release two new word-level handwritten datasets, for Telugu and Devanagari. For pre-training, we generate synthetic datasets containing millions of realistic images with a large vocabulary, rendered using publicly available Unicode fonts. Pre-training on data from the Latin script is also shown to help overcome the shortage of data.
Capitalizing on the success of the CNN-RNN Hybrid architecture, we propose several improvements to the architecture and its training pipeline to make it more robust for handwriting recognition. We change the convolutional part of the network to a ResNet-18-like structure and add a spatial transformer network layer. We also use an extensive data augmentation scheme involving multi-scale, elastic, affine and test-time distortion.
We outperform the previous state-of-the-art methods on existing benchmark datasets for both Latin and Indic scripts by a significant margin. We perform an ablation study to show empirically how each change we made to the original CNN-RNN Hybrid network improves its handwriting recognition performance. We examine the workings of our network's convolutional layers and verify the robustness of the convolutional features through layer visualizations. We hope that the release of the two datasets mentioned in this work, along with the architecture and training techniques we have used, will instill interest among fellow researchers in the field.

 

Year of completion: Mar 2019
Advisor: C. V. Jawahar

Related Publications

  • Kartik Dutta, Praveen Krishnan, Minesh Mathew and C. V. Jawahar - Improving CNN-RNN Hybrid Networks for Handwriting Recognition, The 16th International Conference on Frontiers in Handwriting Recognition, Niagara Falls, USA. [PDF]

  • Kartik Dutta, Praveen Krishnan, Minesh Mathew and C. V. Jawahar - Towards Spotting and Recognition of Handwritten Words in Indic Scripts, The 16th International Conference on Frontiers in Handwriting Recognition, Niagara Falls, USA. [PDF]

  • Kartik Dutta, Praveen Krishnan, Minesh Mathew and C. V. Jawahar - Localizing and Recognizing Text in Lecture Videos, The 16th International Conference on Frontiers in Handwriting Recognition, Niagara Falls, USA. [PDF]

  • Praveen Krishnan, Kartik Dutta and C. V. Jawahar - Word Spotting and Recognition using Deep Embedding, Proceedings of the 13th IAPR International Workshop on Document Analysis Systems, 24-27 April 2018, Vienna, Austria. [PDF]

  • Kartik Dutta, Praveen Krishnan, Minesh Mathew and C. V. Jawahar - Offline Handwriting Recognition on Devanagari using a new Benchmark Dataset, Proceedings of the 13th IAPR International Workshop on Document Analysis Systems, 24-27 April 2018, Vienna, Austria. [PDF]

  • Kartik Dutta, Praveen Krishnan, Minesh Mathew and C. V. Jawahar - Towards Accurate Handwritten Word Recognition for Hindi and Bangla, National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2017. [PDF]


Downloads

thesis