Michael Schmitt, Seyed Ali Ahmadi, Yonghao Xu, Gülşen Taşkın, Ujjwal Verma, Francescopaolo Sica, and Ronny Hänsch

Factors affecting the size and volume of a dataset.

An illustration of the proposed measures to characterize datasets.

Carefully curated and annotated datasets are the foundation of machine learning (ML), with particularly data-hungry deep neural networks forming the core of what is often called artificial intelligence (AI). Due to the massive success of deep learning (DL) applied to Earth observation (EO) problems, the focus of the community has been largely on the development of ever more sophisticated deep neural network architectures and training strategies. For that purpose, numerous task-specific datasets have been created that were largely ignored by previously published review articles on AI for EO. With this article, we want to change the perspective and put ML datasets dedicated to EO data and applications into the spotlight. Based on a review of historical developments, currently available resources are described and a perspective for future developments is formed. We hope to contribute to an understanding that the nature of our data is what distinguishes the EO community from many other communities that apply DL techniques to image data, and that a detailed understanding of EO data peculiarities is among the core competencies of our discipline.

IEEE Geoscience and Remote Sensing Magazine, vol. 11, no. 3, pp. 63-97, 2023-08-09.

IARAI Authors: Yonghao Xu
Research area: Earth Observation
Keywords: Benchmark Dataset, Deep Learning, Earth Observation, Machine Learning, Remote Sensing

