Scene Text Understanding
CVIT

Motivation


                         

Scene text recognition has gained significant attention from the computer vision community in recent years. Images often contain text that carries rich and useful information about their content. Recognizing such text is a challenging problem, even more so than recognizing text in scanned documents. With the rapid growth of camera-based applications on mobile phones, understanding scene text is more important than ever. One could, for instance, foresee an application that answers questions such as “What does this sign say?”. This is related to the problem of Optical Character Recognition (OCR), which has a long history in the computer vision community. However, the success of OCR systems is largely restricted to text from scanned documents. Scene text exhibits large variability in appearance and can prove challenging even for state-of-the-art OCR methods. Many scene understanding methods successfully recognize objects and regions such as roads, trees and sky in an image, but tend to ignore the text on signboards. Our goal is to fill this gap in scene understanding.


Cropped Word Recognition

Papers
Anand Mishra, Karteek Alahari and C. V. Jawahar, Top-down and Bottom-up cues for Scene Text Recognition, IEEE CVPR 2012. [pdf][Abstract][poster][bibtex]

Anand Mishra, Karteek Alahari and C. V. Jawahar, Scene Text Recognition using Higher Order Language Priors, BMVC 2012. [pdf][Abstract][bibtex]

Downloads:
SVT-CHAR: README

IIIT 5K-word: Available now


Scene Text Binarization
   
   
  • Binarization is posed as a labelling problem
  • Foreground and background colours are modelled using GMMs
  • An iterative graph-cut-based algorithm finds a pixel-accurate binarization
  • Improves pixel-level as well as OCR accuracy
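
The points above can be illustrated with a minimal sketch. It is not the method from the paper, and it makes several simplifying assumptions: the image is a small grey-level grid, each colour model is reduced to a single Gaussian (a one-component GMM), text is assumed to be darker than the background, and iterated conditional modes (ICM) stands in for graph cut as the energy minimizer. The names fit_gaussian, unary_cost, binarize, smoothness and var_floor are hypothetical.

```python
import math

def fit_gaussian(values, var_floor=0.01):
    """Fit a single Gaussian (mean, variance) to a list of grey levels."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, var_floor)  # floor avoids a zero variance

def unary_cost(value, model):
    """Negative log-likelihood of a grey level under a Gaussian model."""
    mean, var = model
    return 0.5 * math.log(2 * math.pi * var) + (value - mean) ** 2 / (2 * var)

def binarize(image, smoothness=2.0, max_rounds=10):
    """Label each pixel 1 (text) or 0 (background).

    Alternates between refitting the two colour models and lowering the
    labelling energy (unary cost plus a penalty for every 4-neighbour
    with a different label) using ICM, a simple stand-in for graph cut.
    """
    rows, cols = len(image), len(image[0])
    pixels = [v for row in image for v in row]
    threshold = sum(pixels) / len(pixels)
    # Initial guess (assumption: text is darker than the background).
    labels = [[1 if v < threshold else 0 for v in row] for row in image]
    for _ in range(max_rounds):
        # Step 1: refit one colour model per label.
        groups = {0: [], 1: []}
        for r in range(rows):
            for c in range(cols):
                groups[labels[r][c]].append(image[r][c])
        if not groups[0] or not groups[1]:
            break  # one label vanished; nothing left to separate
        models = {k: fit_gaussian(v) for k, v in groups.items()}
        # Step 2: one ICM sweep over the pixels.
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [labels[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                              if 0 <= rr < rows and 0 <= cc < cols]
                costs = []
                for k in (0, 1):
                    disagree = sum(1 for n in neighbours if n != k)
                    costs.append(unary_cost(image[r][c], models[k])
                                 + smoothness * disagree)
                best = 0 if costs[0] <= costs[1] else 1
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels

# Toy image: light background, one dark vertical stroke, one grey smudge.
image = [[0.9] * 7 for _ in range(5)]
for r in range(5):
    image[r][3] = 0.1
image[1][1] = 0.6
labels = binarize(image)
for row in labels:
    print("".join("#" if v else "." for v in row))
```

In this toy image, plain mean thresholding marks the isolated grey smudge as text, while the smoothness term pulls it back to the background label because all of its neighbours are background. ICM only reaches a local minimum of the energy, which is one reason an exact minimizer such as graph cut is attractive for the binary case.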

Paper
Anand Mishra, Karteek Alahari and C. V. Jawahar, An MRF model for Binarization of Natural Scene Texts, ICDAR 2011. [pdf][Abstract][Slides][bibtex]



Related Publications

1. Anand Mishra, Karteek Alahari and C. V. Jawahar, Scene Text Recognition using Higher Order Language Priors, BMVC 2012 (Oral). [pdf][Abstract][bibtex]

2. Anand Mishra, Karteek Alahari and C. V. Jawahar, Top-down and Bottom-up cues for Scene Text Recognition, IEEE CVPR 2012. [pdf][Abstract][poster][bibtex]

3. Anand Mishra, Karteek Alahari and C. V. Jawahar, An MRF model for Binarization of Natural Scene Texts, ICDAR 2011 (Oral). [pdf][Abstract][Slides][bibtex]




People

Anand Mishra
Karteek Alahari
C. V. Jawahar

Last Modified: Oct 27, 2012