Traffic4cast 2019 inaugurates an ongoing competition series, where the data scale and analysis challenges are extended every year. The competition is related to:
- Traffic forecasting: Our challenge stands in a long tradition of work on analyzing traffic data and predicting traffic-related quantities [7]. The competition we propose stands out first in the scale of the data provided, which matches the increased complexity of the predictive task. For the first time, the challenge also uses a novel format for representing traffic data that naturally preserves the privacy of individual traffic participants.
- Video frame prediction: Technically, traffic data over time can be presented as a movie, so traffic forecasting can be treated as frame prediction. Despite notable activity in this area, often deploying variational autoencoders (VAE) [30, 32, 16] or generative adversarial networks (GAN) [23, 29, 18], predicting future frames of videos remains a challenging problem due to the complexity and motion dynamics of natural scenes [18], especially for longer prediction horizons [16]. Frame sequences driven by an underlying geographical process, rather than by the moving natural objects of common video scenes, thus present an additional exciting challenge to established methods for video frame prediction in this competition.
Let us consider this in the context of prior work, including related earlier competitions:
- RTA Freeway Travel Time Prediction (2010)
  - Task: Travel time prediction on Sydney’s M4 freeway for different time horizons up to 24 hours ahead
  - Data: Historical travel times collected from static loop sensors
- TomTom Traffic Prediction for Intelligent GPS Navigation (2010)
  - Tasks: Traffic congestion prediction, traffic jam modeling
  - Data: Simulated data from the Traffic Simulation Framework (TSF) of the University of Warsaw
- Transportation Forecasting Competition (TRANSFOR 19) (2019)
  - Task: Average speed prediction 5 minutes ahead
  - Data: GPS trajectories on the Second Ring Road of Xi’an City
In addition, several competitions employed GPS trajectory data for prediction tasks. These, however, focused less on modelling traffic states and more on predicting journey destinations, travel times (‘ETA’), or taxi fares. Furthermore, a range of public GPS datasets is available, such as taxi trajectories collected in Rome, San Francisco, or Shanghai, and Microsoft’s GeoLife dataset. In general, however, the taxi data sets are collected from only a few hundred probe vehicles in a single metropolitan area and typically cover only a few weeks or months. The GeoLife data set, in turn, follows only a small set of 182 individuals.
Our competition is thus innovative on several levels:
- The scale of the data provided: In contrast to previous competitions and published data sets, we provide
  - Large-scale data that covers multiple full cities instead of individual road segments or single cities
  - Real-world data reflecting actual observations collected by a large fleet of probe vehicles, rather than synthetic simulations
  - More densely sampled data through larger fleets, giving a better estimate of traffic properties throughout town
  - Multi-level periodic or seasonal effects, from intra-day and intra-week to longer-term changes, e.g., summer vs. winter
This gives a unique, comprehensive longitudinal view of traffic states and their evolution over shorter and longer time spans, across multiple metropolitan areas with markedly different cultural and social backgrounds that will be reflected in the prevailing traffic patterns. Overall, the data that we will share with the scientific community are based on an unprecedented number of probe points.
- The ambition and complexity of the prediction challenge: The competition task is to predict not just one but multiple attributes of the traffic state simultaneously, specifically speed, volume, and heading. Moreover, each cell is characterized in detail by both the attribute averages and the data distributions, while preserving the privacy of individual traffic participants.
- The detailed, privacy-preserving representation and encoding of traffic data: Introducing a novel approach to modelling traffic states, we provide traffic data in an aggregated, privacy-preserving form compiled from individual real-world high-resolution GPS trajectories. Recent societal and legal developments, as reflected for instance in the EU General Data Protection Regulation (GDPR), will increase the demand for analytic methods that can extract information from such aggregate data rather than always requiring precise, and therefore highly sensitive, movement data of private individuals.
In summary, while building on a tradition of traffic modelling, Traffic4cast goes beyond the state of the art by providing a much richer data basis. This both allows and requires more complex analyses of the high-resolution yet aggregated, privacy-preserving data for modelling and predicting traffic states and their evolution at different time scales and at multiple locations around the globe.
We challenge the community to explore and engage in a novel way of traffic forecasting: aggregating high-resolution trajectories in map grid cells and time bins, which retains much detail in characterizing traffic states while protecting individual privacy.
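To make the aggregation into map grid cells and time bins more concrete, the following is a minimal sketch under stated assumptions, not the encoding actually used for the competition data. The function name `aggregate_probes`, its parameters, the channel order, and the use of a circular mean for heading are all illustrative choices; the sketch also covers only per-cell averages, not the per-cell data distributions mentioned above.

```python
import numpy as np


def aggregate_probes(lat, lon, t, speed, heading_deg,
                     bounds, grid_shape, bin_seconds, n_bins):
    """Aggregate individual probe points into movie-like frames.

    Hypothetical helper for illustration only. Returns an array of shape
    (n_bins, n_rows, n_cols, 3) with channels
    0 = volume (number of probe points), 1 = mean speed,
    2 = circular mean heading in degrees within [0, 360).
    Inputs are assumed to be finite numbers.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    n_rows, n_cols = grid_shape
    lat, lon, t = (np.asarray(a, dtype=float) for a in (lat, lon, t))
    speed = np.asarray(speed, dtype=float)
    heading_deg = np.asarray(heading_deg, dtype=float)

    # Map each probe point to a grid cell and a time bin.
    # Assumption: row index grows with latitude, column index with longitude.
    row = np.floor((lat - lat_min) / (lat_max - lat_min) * n_rows).astype(int)
    col = np.floor((lon - lon_min) / (lon_max - lon_min) * n_cols).astype(int)
    tbin = np.floor(t / bin_seconds).astype(int)

    # Drop points that fall outside the bounding box or the time window.
    keep = ((row >= 0) & (row < n_rows) & (col >= 0) & (col < n_cols)
            & (tbin >= 0) & (tbin < n_bins))
    idx = (tbin[keep], row[keep], col[keep])
    theta = np.deg2rad(heading_deg[keep])

    shape = (n_bins, n_rows, n_cols)
    volume = np.zeros(shape)
    speed_sum = np.zeros(shape)
    sin_sum = np.zeros(shape)
    cos_sum = np.zeros(shape)

    # np.add.at accumulates correctly when several points share one cell.
    np.add.at(volume, idx, 1.0)
    np.add.at(speed_sum, idx, speed[keep])
    np.add.at(sin_sum, idx, np.sin(theta))
    np.add.at(cos_sum, idx, np.cos(theta))

    occupied = volume > 0
    mean_speed = np.zeros(shape)
    mean_speed[occupied] = speed_sum[occupied] / volume[occupied]
    mean_heading = np.zeros(shape)
    mean_heading[occupied] = np.rad2deg(
        np.arctan2(sin_sum[occupied], cos_sum[occupied])) % 360.0

    # Only per-cell aggregates leave this function, not individual points.
    return np.stack([volume, mean_speed, mean_heading], axis=-1)


# Toy example: four probe points, a 2 x 2 grid, two 5-minute bins.
frames = aggregate_probes(
    lat=[0.1, 0.2, 0.9, 2.0],          # the last point lies outside the box
    lon=[0.1, 0.2, 0.9, 0.5],
    t=[10.0, 20.0, 400.0, 0.0],        # seconds since the window start
    speed=[10.0, 30.0, 50.0, 99.0],
    heading_deg=[90.0, 90.0, 180.0, 0.0],
    bounds=(0.0, 1.0, 0.0, 1.0),
    grid_shape=(2, 2),
    bin_seconds=300.0,
    n_bins=2,
)
```

The resulting array has one frame per time bin and one channel per attribute, which is the shape of input that video frame prediction methods typically expect. Because only per-cell aggregates are returned, individual trajectories are not directly present in the output; whether this suffices for privacy in practice depends on cell size, bin length, and fleet density.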