Pedro Herruzo & Josep L. Larriba-Pey
Traffic forecasting based on data from infrastructure and citizens is an open and challenging task. The traffic environment in cities is dynamic and increasingly complex due to growing population and mobility demand. In addition, sharing data from vehicles or personal devices requires preserving privacy. In this paper, we introduce a new approach to predicting traffic volume, speed, and direction in aggregated form, with the traffic data presented as videos.
We tackle traffic forecasting as a scene completion task along time. We recast it as a sequence-to-sequence problem and use a U-Net-like architecture. The model processes the input sequence iteratively, embedding each frame into a low-dimensional space with an encoder. A recurrent encoder then accumulates the temporal information of the input sequence into a single representation, and a recurrent decoder produces the embedded predictions. These predictions are mapped back to the original space by a decoder that uses skip connections from each layer of the encoder. The model uses two loss functions, which encourage good predictions in the embedded space and high-definition images in the original space. Exogenous variables such as weather, time, and calendar are also fed into the model. We introduce a sampling approach for sequences that ensures diversity when creating batches and runs in parallel to the optimization process.
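The abstract does not say how batch diversity is enforced, so the following is only a minimal sketch of one possible reading: each batch draws its sequences from distinct days, and a background thread prepares batches while the optimizer would be consuming them. The names `diverse_batches` and `start_sampler`, the "one sequence per day" rule, and all sizes (30 days, 288 frames per day, sequences of 6 frames) are illustrative assumptions, not details taken from the paper.

```python
import queue
import random
import threading


def diverse_batches(num_days, frames_per_day, seq_len, batch_size, num_batches, seed=0):
    """Yield batches of (day, start_frame) pairs.

    Assumption: "diversity" is read here as "no two sequences in a
    batch come from the same day". The paper may define it differently.
    """
    rng = random.Random(seed)
    for _ in range(num_batches):
        # sample() draws without replacement, so days never repeat in a batch.
        days = rng.sample(range(num_days), batch_size)
        # Each start index leaves room for a full sequence of seq_len frames.
        yield [(day, rng.randrange(frames_per_day - seq_len + 1)) for day in days]


def start_sampler(out_queue, **kwargs):
    """Fill out_queue from a background thread; None marks the end."""

    def worker():
        for batch in diverse_batches(**kwargs):
            out_queue.put(batch)  # blocks while the queue is full
        out_queue.put(None)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


# Hypothetical sizes, chosen only for illustration.
batches = queue.Queue(maxsize=4)
start_sampler(batches, num_days=30, frames_per_day=288, seq_len=6,
              batch_size=8, num_batches=3)

collected = []
while (batch := batches.get()) is not None:
    collected.append(batch)  # a training step would consume the batch here
```

The bounded queue keeps the sampler a few batches ahead of the consumer without letting it run arbitrarily far ahead. In a real training loop the index pairs would be used to load the corresponding frames before each optimization step.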
Proceedings of the NeurIPS 2019 Competition and Demonstration Track, PMLR 123:47-55, 2020-08-19