Role of Scene Text Understanding in Enhancing Driver Assistance
George Tom
Abstract
Scene text conveys important information to drivers and pedestrians, serving as an indispensable means of communication in various environments. It communicates speed limits, route information, rest stops, and exits, among other details. Drivers and passengers need to understand this information for a safe and efficient journey. However, outdoor scenes are cluttered with text, which distracts drivers and can compromise their ability to focus on essential details and navigate safely. Recognising scene text in motion aggravates this challenge, as textual cues appear only briefly and must be detected early, at a distance. Driving scenarios introduce additional complexities, including occlusions, motion blur, perspective distortions, and varying text sizes, further complicating scene text understanding.
In this thesis, we aim to improve scene text understanding in driving scenarios through video question answering and by analysing the present state of scene text detection, recognition, and tracking: (i) we introduce new video question answering tasks and datasets in which answering the questions requires an understanding of text and road signs in driving videos; (ii) we examine the current state of scene text detection, tracking, and recognition in the driving domain through the RoadText-1K competition; (iii) we explore detection and recognition under occlusion, a common yet under-explored complication in real-world driving scenarios. By focusing on these areas, the thesis contributes to advancing scene text analysis methodologies, offering insights and solutions that are imperative for developing more intelligent and responsive driver assistance systems.
Year of completion: November 2024
Advisors: Prof. C V Jawahar, Prof. Dimosthenis Karatzas