Towards Understanding Small Objects in Indian Driving Situations

Umamahesh Janapareddi

Abstract

In Indian urban and rural driving scenarios, small objects are pervasive and often crucial for safe navigation. These objects include pedestrians crossing roads, children playing near streets, cyclists, stray animals, and small vehicles such as scooters and motorbikes. Traffic signs, signal lights, potholes, and road markings (such as lane dividers or zebra crossings) are also often small in size but essential for driving decisions. In such contexts, missing or inaccurately segmenting these small objects can lead to critical detection errors, causing accidents or delays in the vehicle’s decision-making process. Automated understanding of such objects therefore needs to start with detection and segmentation.

Semantic segmentation is a critical task in computer vision with a wide range of applications. The objective is to partition an image (a collection of pixels) into distinct labeled regions, each corresponding to a specific object or part of the scene. This process is crucial for scene understanding and enables the localization of objects within the image. Semantic segmentation has progressed significantly over time, especially with the advent of deep learning, which has pushed performance well beyond that of traditional methods.

When discussing semantic segmentation, we often focus on datasets, the objects within those datasets, and their corresponding segmentations. While many datasets exist for road scenarios, particularly those representing Western road conditions, there is relatively little research on road conditions specific to India. One notable exception is the Indian Driving Dataset (IDD), which is designed specifically for semantic segmentation of Indian road scenarios.

Road and driving datasets typically contain objects of varying sizes within each class label. These objects can be broadly categorized into three types: small, medium, and large. The importance of segmentation is well understood across several domains such as medical imaging, autonomous vehicles, aerial imagery, robotics, surveillance, and industrial automation. However, segmenting small objects remains one of the most challenging problems in the field. It is particularly difficult due to factors such as (i) the limited number of pixels representing small objects, (ii) class imbalance during training, and (iii) the inherent challenges posed by small object representations. These factors hinder the performance of deep learning architectures, making it harder for modern techniques to handle small objects accurately.
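
To make the size categories and the pixel-level imbalance concrete, the following is a minimal sketch and not the method used in this work. The area thresholds are an assumption borrowed from the COCO evaluation convention (small below 32×32 pixels, large from 96×96 pixels); the abstract does not state its own thresholds. The function names and the class ids in the toy label map are hypothetical.

```python
from collections import Counter

# Assumed area thresholds in pixels, borrowed from the COCO convention.
# The abstract does not define its own size boundaries.
SMALL_MAX_AREA = 32 * 32
MEDIUM_MAX_AREA = 96 * 96


def size_category(area_px):
    """Bucket an object by its pixel area using the assumed thresholds."""
    if area_px < SMALL_MAX_AREA:
        return "small"
    if area_px < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def class_pixel_share(label_map):
    """Return the fraction of pixels assigned to each class id in a 2D label map."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Toy 4x8 label map with hypothetical class ids: 0 = road, 1 = traffic sign.
toy_map = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
shares = class_pixel_share(toy_map)
sign_area = sum(row.count(1) for row in toy_map)
print(shares, size_category(sign_area))
```

Even in this toy map, the sign class covers only a small fraction of the pixels, which illustrates factors (i) and (ii): a loss averaged uniformly over pixels is dominated by the large class.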

Year of completion: March 2025
Advisor: Jawahar C V