Talk Description
Institution: University of Melbourne - Victoria, Australia
Data quality improvement is one of the most important stages of a data scientist's work, and one where much of their time is spent, because the accuracy of machine learning models depends on the quality of the data. Traffic data are known to have quality issues due to sensor faults, system downtime, extreme weather conditions, and other circumstances. Missing values are among the most common of these issues in traffic and transport applications. The objective of this study is to explore efficient and effective data imputation methods for dealing with missing data and to identify the most suitable method for real-time transportation and traffic applications. In this paper, we focus on non-domain-specific approaches that require only minimal data, memory, and computation, so that they generalize to different real-time applications.