Prediction through Description
In the last decade, the big data and machine learning (ML) revolutions have changed the way we study our planet. A wealth of observational data are being collected by a plethora of sensors at unprecedented spatial, spectral, and temporal resolutions. Both the number and the quality of measurement techniques have increased significantly. Hyperspectral imaging extends optical imaging by measuring a large number of spectral bands. Synthetic aperture radar (SAR) sensors provide valuable texture information by measuring radar backscatter. These techniques are complemented by light detection and ranging (LiDAR) and terrestrial laser scanning (TLS), which produce point clouds representing elevation rather than images. The number of observation platforms has also been growing, with a multitude of satellites in orbit, unmanned aerial vehicles (UAVs), ground sensors, and social media streams. The collected data are rising quickly in volume (already hundreds of petabytes), speed (an estimated 5 petabytes/year), variety (in sampling frequencies, spectral ranges, spatiotemporal scales, and dimensionality), and uncertainty (from observational errors to conceptual inconsistencies). Data-driven approaches, in particular ML, are a natural choice for extracting and analyzing information from this data deluge. ML models are now routinely used for classification of land cover types, modeling of land-atmosphere and ocean-atmosphere exchange of greenhouse gases, detection of anomalies and extreme events, and causal discovery.