LectureVideoDB - A Dataset for Text Detection and Recognition in Lecture Videos

Abstract

Lecture videos are rich in textual information, and the ability to understand that text is useful for larger video understanding and analysis applications. Though text recognition from images has been an active research area in computer vision, text in lecture videos has mostly been overlooked. In this work, we investigate the efficacy of state-of-the-art handwritten and scene text recognition methods on text in lecture videos. To this end, we introduce a new dataset, LectureVideoDB, compiled from frames of multiple lecture videos. Our experiments show that existing methods do not fare well on the new dataset. The results point to the need to improve existing methods for robust performance on lecture videos.

Major Contributions

A new dataset comprising over 5000 frames from lecture videos, annotated for text detection and recognition.
The dataset is benchmarked using existing state-of-the-art methods for scene text detection and for scene text and handwriting recognition (HWR).

Related Publications

Kartik Dutta, Minesh Mathew, Praveen Krishnan and C. V. Jawahar, Localizing and Recognizing Text in Lecture Videos, International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018.

Dataset:

LectureVideoDB

Bibtex

If you use this work or dataset, please cite:

@inproceedings{lectureVideoDB2018,
    title={Localizing and Recognizing Text in Lecture Videos},
    author={Dutta, Kartik and Mathew, Minesh and Krishnan, Praveen and Jawahar, C.~V.},
    booktitle={ICFHR},
    year={2018}
}