Driving into the Dataverse: Real and Synthetic Data for Autonomous Vehicles

Shubham Dokania


The rapid advancement of autonomous driving systems is transforming the future of transportation and urban mobility. However, these systems face significant challenges when deployed in complex and unstructured traffic environments, such as those found in many cities in South Asian countries like India. This thesis addresses the challenges of data collection, management, and generation for autonomous driving systems operating in such scenarios. Its primary contributions are a data collection toolkit, a comprehensive driving dataset for unstructured environments (IDD-3D), and a synthetic data generation framework (TRoVE) based on real-world data.

The data collection toolkit enables sensor fusion for driving scenarios by treating sensor APIs as entities separate from the data collection interface. This design allows any sensor configuration to be used, and we demonstrate the smooth creation of driving datasets with it. The toolkit is adaptable to different environments and can be easily scaled, making it a crucial step towards creating a large-scale dataset for Indian road scenarios.

The IDD-3D dataset provides a valuable resource for studying unstructured driving scenarios with complex road situations. We present a thorough statistical and experimental analysis of the dataset, which includes high-quality annotations for 3D object bounding boxes and instance IDs for tracking. We highlight the diverse object types, categories, and complex trajectories found in Indian road scenes, which support the development of robust autonomous driving systems that can generalize across geographical locations. In addition, we provide benchmarks for 3D object detection and tracking using state-of-the-art approaches.

The TRoVE synthetic data toolkit offers a framework for automatically generating synthetic data for visual perception by leveraging existing real-world data. By combining synthetic data with real data, we show the potential for improved performance in various computer vision tasks. The data generation process can be extended to different locations and scenarios, avoiding the bounded data volume and limited variety of manually designed virtual environments.

This thesis contributes to the development of autonomous driving systems for complex environments by addressing the challenges of data acquisition, management, and generation. By bridging the gap between the current state of the art and the needs of unstructured traffic scenarios, this work paves the way for more robust and versatile intelligent transportation systems that can operate safely and efficiently in a wide range of situations.

Year of completion: July 2023
Advisor: C V Jawahar

Related Publications