Talk Description
Institution: Institute of Communication and Computer Systems (ICCS) of the National Technical University of Athens (NTUA) - Attiki, Athens, Greece
Applying machine learning (ML) methods in autonomous vehicles requires vast amounts of training data that are representative of the real world. However, collecting such data can be expensive, time-consuming, and sometimes impossible due to safety concerns. Simulations offer a safe and cost-effective way to generate large amounts of training data, but generating simulation data presents challenges of its own. Within the latest European project we are participating in, we have identified two main challenges: the diversity of the produced data and the accuracy of the simulation models used.
A lack of diversity in the training data can lead to overfitting and poor generalization of ML models. To address this challenge, we explore several approaches. These range from extending and enriching existing data augmentation techniques (e.g., perspective transformations, color and blur manipulation) to more recent techniques such as domain adaptation, which uses generative models to translate synthetic data into realistic-looking data or to adapt existing images to specific characteristics (e.g., sunny to cloudy, day to night).
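As an illustration of the classical augmentation end of this spectrum, the following is a minimal sketch of color and blur manipulation using only NumPy. It is not the project's pipeline: the function names (`jitter_brightness_contrast`, `box_blur`, `augment`) and the parameter ranges are assumptions chosen for the example, images are assumed to be float arrays of shape H x W x C with values in [0, 1], and perspective transformations are left out for brevity.

```python
import numpy as np


def jitter_brightness_contrast(image, brightness=0.0, contrast=1.0):
    """Scale contrast around the image mean, then shift brightness.

    Assumes a float image with values in [0, 1]; the result is clipped
    back to that range.
    """
    mean = image.mean()
    out = (image - mean) * contrast + mean + brightness
    return np.clip(out, 0.0, 1.0)


def box_blur(image, radius=1):
    """Blur an H x W x C image with a (2*radius+1)^2 mean filter.

    Borders are handled by replicating edge pixels.
    """
    if radius <= 0:
        return image.copy()
    k = 2 * radius + 1
    padded = np.pad(
        image, ((radius, radius), (radius, radius), (0, 0)), mode="edge"
    )
    h, w = image.shape[:2]
    out = np.zeros_like(image, dtype=np.float64)
    # Sum all shifted copies of the image that fall inside the window.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def augment(image, rng):
    """Apply a random brightness/contrast jitter followed by a random blur.

    The sampling ranges below are illustrative assumptions, not tuned values.
    """
    brightness = rng.uniform(-0.2, 0.2)
    contrast = rng.uniform(0.8, 1.2)
    radius = int(rng.integers(0, 3))  # 0 means no blur
    jittered = jitter_brightness_contrast(image, brightness, contrast)
    return box_blur(jittered, radius)
```

Calling `augment(image, np.random.default_rng(seed))` repeatedly on the same simulated frame would yield several photometric variants of it. Generative domain adaptation (e.g., day to night) requires a trained model and is beyond what such a sketch can show.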
The soundness and consistency of the simulation models are essential for ensuring that they reflect the real-world environment closely enough for their intended purpose. Achieving this means incorporating accurate, well-designed physical models, sensor models, and environmental conditions into the simulations. Doing so requires extensive domain knowledge, which is why we have fostered collaboration between domain experts and ML researchers within the project.