Daniel Klotz, Frederik Kratzert, Martin Gauch, Alden Keefe Sampson, Günter Klambauer, Sepp Hochreiter, and Grey Nearing
Machine learning algorithms perform well in a variety of environmental modeling tasks, but typically do not provide uncertainty estimates. Here, we benchmark several methods for uncertainty estimation in hydrology with deep learning. We employ a previously developed rainfall-runoff model based on the Long Short-Term Memory (LSTM) network, with an additional hidden layer for flexibility.
We build benchmarks using data from the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset. CAMELS is a standard public dataset containing daily meteorological data and geological attributes for basins across the contiguous United States from 1980 through 2010. To compare uncertainty estimates, we use reliability and resolution as metrics. Reliability, which measures how consistent the uncertainty estimates are with the observations, is evaluated with probability plots; resolution, which reflects the width of the predictive distribution, is evaluated with a set of summary statistics.
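As a rough illustration of the two metrics, the sketch below computes the points of a probability plot from sampled predictions and a simple width statistic. It is a minimal example under stated assumptions, not the benchmark's evaluation code: the function names, the use of sampled predictions, the mean absolute deviation from the diagonal as a summary of reliability, and the 5% to 95% interval width as a resolution statistic are all illustrative choices.

```python
import random


def pit_values(samples_per_step, observations):
    """For each time step, the fraction of predictive samples <= the observation."""
    return [
        sum(s <= obs for s in samples) / len(samples)
        for samples, obs in zip(samples_per_step, observations)
    ]


def probability_plot(pit, n_bins=10):
    """Nominal vs. observed cumulative frequencies on an evenly spaced grid.

    For reliable predictions the observed values lie close to the nominal ones
    (the 1:1 line of the plot).
    """
    nominal = [(i + 1) / n_bins for i in range(n_bins)]
    observed = [sum(p <= t for p in pit) / len(pit) for t in nominal]
    return nominal, observed


def reliability_deviation(pit, n_bins=10):
    """Mean absolute distance of the probability plot from the 1:1 line (assumed summary)."""
    nominal, observed = probability_plot(pit, n_bins)
    return sum(abs(o - n) for n, o in zip(nominal, observed)) / n_bins


def mean_interval_width(samples_per_step, lower=0.05, upper=0.95):
    """Average width of the central interval between two empirical quantiles."""
    widths = []
    for samples in samples_per_step:
        ordered = sorted(samples)
        lo = ordered[int(lower * (len(ordered) - 1))]
        hi = ordered[int(upper * (len(ordered) - 1))]
        widths.append(hi - lo)
    return sum(widths) / len(widths)


if __name__ == "__main__":
    # Synthetic data only: observations and predictions share one distribution.
    rng = random.Random(0)
    obs = [rng.gauss(0.0, 1.0) for _ in range(200)]
    preds = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in obs]
    print(reliability_deviation(pit_values(preds, obs)))
    print(mean_interval_width(preds))
```

The two quantities have to be read together: a very narrow predictive distribution scores well on width but shows up as a large deviation from the 1:1 line if the observations fall outside it too often.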
To make uncertainty predictions, we adapt three approaches based on Mixture Density Networks (MDNs) and one based on Monte Carlo Dropout (MCD). MDNs use a neural network to predict the weights and parameters of a mixture of simple component distributions, such as Gaussians. MCD builds on dropout, a regularization technique that randomly ignores network units during training; keeping dropout active at prediction time yields a sample of predictions from repeated forward passes. Our results show that MDNs, in particular with asymmetric Laplacian distributions, perform better than MCD. We demonstrate that deep learning models can produce statistically reliable uncertainty estimates.
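To make the mixture idea concrete, the following sketch evaluates the density and negative log-likelihood of a mixture of asymmetric Laplacian components. It is only an illustration: it assumes the common quantile-regression parameterization with location mu, scale s, and asymmetry tau (which may differ from the one used in the benchmark), the parameter values are made up, and in an MDN the raw weights, locations, scales, and asymmetries would come from the network output rather than being fixed by hand.

```python
import math


def ald_pdf(x, mu, s, tau):
    """Asymmetric Laplacian density (assumed parameterization).

    f(x) = tau * (1 - tau) / s * exp(-rho(u)), with u = (x - mu) / s and
    rho(u) = u * (tau - 1[u < 0]). A fraction tau of the mass lies below mu.
    """
    u = (x - mu) / s
    check = u * (tau - (1.0 if u < 0 else 0.0))
    return tau * (1.0 - tau) / s * math.exp(-check)


def softmax(logits):
    """Turn unconstrained scores into mixture weights that sum to one."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_pdf(x, weights, mus, scales, taus):
    """Weighted sum of the component densities."""
    return sum(
        w * ald_pdf(x, m, s, t)
        for w, m, s, t in zip(weights, mus, scales, taus)
    )


def negative_log_likelihood(y, weights, mus, scales, taus):
    """Loss for one observation; training would minimize its average over the data."""
    return -math.log(mixture_pdf(y, weights, mus, scales, taus))


if __name__ == "__main__":
    # Hypothetical values standing in for one time step of network output.
    weights = softmax([0.2, -0.1])
    mus, scales, taus = [0.5, 2.0], [0.4, 1.0], [0.3, 0.7]
    print(negative_log_likelihood(1.2, weights, mus, scales, taus))
```

Because each component carries its own asymmetry, the mixture can represent skewed predictive distributions, which a mixture of a few symmetric Gaussians can only approximate with more components.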
Hydrology and Earth System Sciences, under review, 2021-03-15.