The Traffic4cast 2020 competition is based on high-resolution real-world traffic data provided by our data sponsor, HERE Technologies. The core challenge data are derived from a large fleet of probe vehicles, a live incident feed, and static map data, and are grouped into spatiotemporal cells of 100 m x 100 m x 5 min. The core data contain dynamic features, which encode traffic volume, speed, direction, and incidents, and static features, which describe road junctions and points of interest such as restaurants, shopping, and parking.
The goal of the bonus challenge is to predict air quality in three major cities: Berlin, Istanbul, and Moscow. Because traffic contributes substantially to air pollution, the bonus challenge aims to explore the correlations in space and time between air pollution and traffic features. Participants are encouraged to investigate which locations and time intervals within the cities are most affected by traffic, taking into account street characteristics, weather (temperature, wind, sunlight), and seasonal changes.
The satellite sensor data on air pollutants around the globe are provided by the ESA Copernicus Sentinel-5P project. The levels of ozone (O3), nitrogen dioxide (NO2), sulfur dioxide (SO2), methane (CH4), and formaldehyde (CH2O) are reported daily throughout the year. For each city, the data are divided spatially into a grid of 50 (latitude) x 44 (longitude) cells of approximately 1 km x 1 km each. For each pollutant, the data form a three-dimensional tensor whose first dimension spans the days on which satellite images were obtained (data are not available for some days because the sensors cannot measure through clouds). To compare the air pollution data with the traffic data, the traffic tensors, which contain 495 x 436 spatial cells, need to be downsampled by a factor of about 10 in each spatial dimension. See our Forums to learn more about the data structure of the bonus challenge.
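As an illustration of the downsampling step, the following is a minimal sketch in Python with NumPy, not the organizers' procedure. It assumes that the two grids share the same origin and orientation, that a block size of 10 x 10 traffic cells per air-quality cell is acceptable (495/50 and 436/44 are both about 9.9, so the grid is padded at its trailing edge to 500 x 440), and that averaging is the desired aggregation. The function name `downsample_traffic` and its parameters are hypothetical.

```python
import numpy as np


def downsample_traffic(frame, out_shape=(50, 44), block=10):
    """Block-average one 2-D traffic frame onto a coarser grid.

    Assumption: both grids are aligned at index (0, 0); the frame is
    padded with NaN at the trailing edge so it divides evenly into blocks.
    """
    rows, cols = frame.shape
    pad_r = out_shape[0] * block - rows
    pad_c = out_shape[1] * block - cols
    if pad_r < 0 or pad_c < 0:
        raise ValueError("frame is larger than out_shape * block")

    # Pad with NaN so the padding does not bias the block means.
    padded = np.pad(
        frame.astype(float),
        ((0, pad_r), (0, pad_c)),
        mode="constant",
        constant_values=np.nan,
    )

    # Split each spatial axis into (coarse index, position within block).
    blocks = padded.reshape(out_shape[0], block, out_shape[1], block)

    # Average over the within-block axes, ignoring the NaN padding.
    return np.nanmean(blocks, axis=(1, 3))


# Usage with a synthetic frame (random values, not real traffic data).
rng = np.random.default_rng(0)
traffic_frame = rng.random((495, 436))
coarse = downsample_traffic(traffic_frame)
print(coarse.shape)  # (50, 44)
```

Averaging suits quantities such as speed; for volume, a sum over each block (for example `np.nansum`) may be the more natural choice. Days with no satellite data would still need to be excluded or masked before comparing the two sources.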