LithoNet: A Benchmark Dataset for Machine Learning with Digital Outcrops
Sam Thiele, Ahmed J Afifi, Sandra Lorenz, Raimon Tolosana-Delgado, Moritz Kirsch, Pedram Ghamisi, and Richard Gloaguen
Deep learning techniques are increasingly used to automatically derive geological maps from digital outcrop models, reducing interpretation time and (ideally) bias. Such techniques are especially needed when hyperspectral images are back-projected to create data-rich ‘hypercloud’ type digital outcrop models. However, accurate validation of these automated mapping approaches is a significant challenge, due to the subjective nature of geological mapping and the difficulty of collecting quantitative validation data. In turn, this makes it exceedingly difficult to compare different machine learning approaches for geological applications. Furthermore, many state-of-the-art deep learning methods are limited to 2-D image data, so applying them to 3-D digital outcrops (e.g., hyperclouds) remains an outstanding challenge.
In this contribution we present LithoNet, a benchmark digital outcrop dataset designed to (1) quantitatively compare learning approaches for geological mapping, and (2) facilitate the development of new approaches that are compatible with unstructured 3-D data (i.e., point clouds). LithoNet comprises two halves: a set of real digital outcrop models acquired at Corta Atalaya (Spain), attributed with different spectral and ground-truth data, and a synthetic twin that uses latent features in the original datasets to reconstruct realistic spectral data (including sensor noise and processing artifacts) from the ground truth. We have used these datasets to explore the capabilities of different machine learning approaches for automated geological mapping. By making LithoNet public, we hope to foster the development and adaptation of new machine learning tools.
EGU23-14007, 2023-02-22.