Datasets#

torchgeo-bench supports two generations of GeoBench datasets — V1 and V2 — plus a small wrapper around torchgeo’s standalone EuroSAT dataset for sanity checks. All datasets share the BenchDataset interface and are auto-registered on import so they can be selected by their CLI name.

Filesystem layout#

All data lives under ./data/ relative to the current working directory. The runner's paths are fixed: it does not honour environment variables such as GEOBENCH_ROOT. If you keep data elsewhere (for example after downloading with --output-dir), symlink data/ to the real location.
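The symlink workaround can be scripted. The following is a minimal sketch rather than part of torchgeo-bench; the helper name `link_data_dir` is hypothetical, and it assumes the platform permits creating directory symlinks.

```python
from pathlib import Path


def link_data_dir(real_root, workdir):
    """Create <workdir>/data as a symlink to real_root unless data/ already exists."""
    real_root = Path(real_root).resolve()
    link = Path(workdir) / "data"
    if link.is_symlink() or link.exists():
        # Leave an existing data/ directory or link untouched.
        return link
    link.symlink_to(real_root, target_is_directory=True)
    return link
```

Calling `link_data_dir("/scratch/data", ".")` from the directory where the runner is launched would make ./data/ resolve to /scratch/data.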

Family	Default destination	Source
`geobench_v1`	`data/classification_v1.0/`	Hugging Face `recursix/geo-bench-1.0`
`geobench_v2`	`data/geobenchv2/<name>/`	Hugging Face `aialliance/<name>`
`eurosat`	`data/eurosat/`	torchgeo’s `EuroSAT` downloader

Downloading#

The bundled command-line interface provides one download subcommand per family:

$ torchgeo-bench download geobench_v1                              # full V1 bundle
$ torchgeo-bench download geobench_v2                              # default V2 set
$ torchgeo-bench download geobench_v2 --datasets benv2,burn_scars  # V2 subset
$ torchgeo-bench download eurosat                                  # torchgeo EuroSAT
$ torchgeo-bench download geobench_v2 --output-dir /scratch/data   # custom root

The default V2 download set is: benv2, burn_scars, caffe, cloudsen12, dynamic_earthnet, flair2, forestnet, fotw, kuro_siwo, pastis, so2sat, spacenet2, spacenet7, treesatai.

GeoBench V1 — classification#

V1 datasets use the m- prefix on the command line.

The first time a V1 dataset is requested and no local copy exists under data/classification_v1.0 or data/classification_v1.0_wds, the loader downloads that dataset automatically from the public mirror isaaccorley/geobenchv1-webdataset on the Hugging Face Hub. Set GEOBENCH_V1_NO_HF_DOWNLOAD=1 to disable the auto-download and use local data only; torchgeo-bench download geobench_v1 still works for the legacy per-sample HDF5 layout.
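The decision described above can be illustrated with a short sketch. This is not the loader's actual code: the function name is hypothetical, it checks only the two top-level directories rather than per-dataset contents, and it assumes that only the literal value "1" disables the download.

```python
import os
from pathlib import Path

# Local directories named in the text above.
V1_LOCAL_DIRS = ("classification_v1.0", "classification_v1.0_wds")


def v1_should_auto_download(data_root, environ=None):
    """Return True when no local V1 copy exists and the opt-out is not set."""
    environ = os.environ if environ is None else environ
    root = Path(data_root)
    has_local = any((root / name).is_dir() for name in V1_LOCAL_DIRS)
    # Assumption: only the literal value "1" disables the download.
    opted_out = environ.get("GEOBENCH_V1_NO_HF_DOWNLOAD") == "1"
    return not has_local and not opted_out
```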

CLI name	#cls	bands	multilabel	sensor	Class
`m-bigearthnet`	43	12	yes	Sentinel-2	`MBigEarthNet`
`m-brick-kiln`	2	13	no	Sentinel-2	`MBrickKiln`
`m-eurosat`	10	13	no	Sentinel-2	`MEurosat`
`m-forestnet`	12	6	no	Landsat	`MForestnet`
`m-pv4ger`	2	3	no	Aerial RGB	`MPv4ger`
`m-so2sat`	17	18	no	Sentinel-1 + S2	`MSo2Sat`

Multi-label datasets (m-bigearthnet) report the micro_mAP metric instead of accuracy.
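For readers unfamiliar with the metric, here is a pure-Python sketch of micro-averaged average precision. It assumes micro_mAP means pooling every (sample, class) decision before ranking; the benchmark's exact definition and tie handling may differ.

```python
def micro_average_precision(labels, scores):
    """Micro-averaged AP: pool every (sample, class) pair, then rank by score."""
    pairs = [
        (score, label)
        for label_row, score_row in zip(labels, scores)
        for label, score in zip(label_row, score_row)
    ]
    # Highest score first; ties are not handled specially (a simplification).
    pairs.sort(key=lambda p: p[0], reverse=True)
    total_pos = sum(1 for _, label in pairs if label)
    if total_pos == 0:
        raise ValueError("no positive labels")
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(pairs, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision at the rank of this positive
    return ap / total_pos
```

Unlike accuracy, this score depends on how well positives are ranked above negatives across all classes, which suits multi-label outputs where each sample can carry several tags.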

GeoBench V2 — classification#

CLI name	#cls	bands	multilabel	sensor	Class
`benv2`	19	14	yes	Sentinel-1 + Sentinel-2 (multi-modal)	`BENV2`
`forestnet`	12	6	no	Sentinel-2	`Forestnet`
`so2sat`	17	12	no	Sentinel-1 + Sentinel-2	`So2Sat`
`treesatai`	13	19	yes	Aerial + S2 + S1 (multi-modal)	`TreeSatAI`

V2 datasets are stored as a single .tortilla file each, hosted under aialliance/<name> on the Hugging Face Hub. _V2Dataset.get_dataset passes download=True to the upstream class, so a missing tortilla is fetched on first use and no separate torchgeo-bench download step is required for the sweep.

Note

Several V1 and V2 datasets have near-identical CLI names (the V1 name adds the m- prefix), but the underlying data are different datasets, with different sensors, splits, label spaces, or normalisation conventions:

m-bigearthnet (V1) — original BigEarthNet, 43-class multilabel scene tags, S2 top-of-atmosphere DN. benv2 (V2) — BigEarthNet v2.0 with 19-class multilabel and stacked S1 + S2.
m-so2sat (V1) — 18-band stack including LCZ context. so2sat (V2) — 10 S2 + 2 S1 bands stored at reflectance scale.
m-forestnet (V1) — Landsat-8 6-band uint8. forestnet (V2) — Sentinel-2 6-band uint8 with a different split than V1 despite the same channel count.

The leaderboard prefixes panel titles with GeoBench V1 / GeoBench V2 so V1 and V2 results are never compared on the same axis.

GeoBench V2 — segmentation#

CLI name	#cls	bands	notes	Class
`burn_scars`	3	6		`BurnScars`
`caffe`	4	1	aerial grayscale	`CaFFe`
`cloudsen12`	4	12		`CloudSEN12`
`dynamic_earthnet`	7	16		`DynamicEarthNet`
`flair2`	13	5	aerial + Sentinel-2	`FLAIR2`
`fotw`	4	4	Fields of the World	`FieldsOfTheWorld`
`kuro_siwo`	4	3	SAR `vv` / `vh` + DEM (no RGB triplet)	`KuroSiwo`
`pastis`	20	16	Sentinel-2 + Sentinel-1 (multi-modal)	`PASTIS`
`spacenet2`	3	9	WorldView 8-band + pan	`SpaceNet2`
`spacenet7`	3	3		`SpaceNet7`

Other#

CLI name	Class
`eurosat`	`EuroSAT` (torchgeo wrapper, random split)
`eurosat-spatial`	`EuroSATSpatial` (longitude-based split)
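The CLI names in the tables on this page fall into three families. The sketch below classifies a name by the m- prefix rule and the listed V2 and standalone names; `family_of` is a hypothetical helper, not part of the package, and the returned labels mirror the leaderboard prefixes.

```python
# V2 names taken from the classification and segmentation tables on this page.
V2_NAMES = {
    "benv2", "forestnet", "so2sat", "treesatai",
    "burn_scars", "caffe", "cloudsen12", "dynamic_earthnet", "flair2",
    "fotw", "kuro_siwo", "pastis", "spacenet2", "spacenet7",
}
OTHER_NAMES = {"eurosat", "eurosat-spatial"}


def family_of(cli_name):
    """Map a CLI dataset name to its family label."""
    if cli_name.startswith("m-"):
        return "GeoBench V1"
    if cli_name in V2_NAMES:
        return "GeoBench V2"
    if cli_name in OTHER_NAMES:
        return "Other"
    raise ValueError(f"unknown dataset name: {cli_name!r}")
```

Note that `family_of("m-so2sat")` and `family_of("so2sat")` return different families, which is why their results should not be compared directly.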

Selecting datasets#

Pass a single dataset, a comma-separated list, or all to evaluate every registered dataset:

$ torchgeo-bench run dataset.names=[m-eurosat]
$ torchgeo-bench run dataset.names=[burn_scars,pastis,flair2]
$ torchgeo-bench run dataset.names=all
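To make the three accepted forms concrete, here is a hypothetical parser for the dataset.names value. It is only an illustration of the semantics described above; the real runner's configuration handling is not shown in this page and may work differently.

```python
def parse_dataset_names(value, registered):
    """Expand a dataset.names value into a list of registered dataset names."""
    value = value.strip()
    if value == "all":
        return list(registered)
    # Accept both "[a,b]" and a bare "a,b" or "a".
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]
    names = [part.strip() for part in value.split(",") if part.strip()]
    unknown = [name for name in names if name not in registered]
    if unknown:
        raise ValueError(f"unknown dataset(s): {', '.join(unknown)}")
    return names
```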

Band selection#

Each dataset declares an ordered list of BandSpec objects. Three modes are supported:

dataset.bands=rgb (default) — only the bands listed in rgb_bands.
dataset.bands=all — every band the dataset exposes.
dataset.bands=[red,green,blue,nir] — an explicit subset.

The runner derives num_channels from the loaded tensor and constructs the matching list[BandSpec] so the model wrapper can size its input layer and per-channel normalization correctly. The selected bands value is recorded in the results CSV, so when several runs write to the same file (including with resume=true), RGB and multispectral results remain distinguishable.
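The three selection modes can be sketched as follows. The `BandSpec` fields and the `select_bands` helper here are assumptions made for illustration; the real class may carry different attributes, and the runner derives num_channels from the loaded tensor rather than from this list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandSpec:
    # Hypothetical fields; the real BandSpec may differ.
    name: str
    sensor: str = "unknown"


def select_bands(all_bands, rgb_bands, mode="rgb"):
    """Return the BandSpec list for mode 'rgb', 'all', or an explicit name list."""
    if mode == "all":
        return list(all_bands)
    if mode == "rgb":
        names = list(rgb_bands)
    elif isinstance(mode, str):
        raise ValueError(f"unsupported bands mode: {mode!r}")
    else:
        names = list(mode)
    by_name = {band.name: band for band in all_bands}
    missing = [name for name in names if name not in by_name]
    if missing:
        raise KeyError(f"unknown band(s): {', '.join(missing)}")
    # Keep the requested order so channel positions match the request.
    return [by_name[name] for name in names]
```

In this sketch the channel count for the model's input layer would simply be `len(select_bands(...))`.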

$ # All 13 Sentinel-2 bands on EuroSAT with a pretrained timm ResNet-18
$ torchgeo-bench run model=timm/resnet18 dataset.names=[m-eurosat] dataset.bands=all

Multi-modality (V2)#

Several V2 datasets are multi-sensor (e.g. treesatai = aerial + S2 + S1, pastis = S2 + S1, kuro_siwo = SAR + DEM). Their wrappers set band_order_strategy = "by_sensor" and the V2 base class groups BandSpec entries by sensor before passing them to the upstream geobench_v2 loader. End users do not need to do anything special — set dataset.bands=all (or an explicit subset) and the right per-modality tensors are concatenated into a single image key.
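As a rough illustration of the "by_sensor" idea, the sketch below groups bands by sensor in first-seen order and then concatenates per-sensor channel lists under a single image key. Both function names and the plain-list data layout are hypothetical; the actual wrappers operate on tensors and the upstream loader's own structures.

```python
def group_by_sensor(bands):
    """Group (sensor, band_name) pairs by sensor, preserving first-seen order."""
    groups = {}
    for sensor, name in bands:
        groups.setdefault(sensor, []).append(name)
    return groups


def concat_modalities(groups, channels_by_sensor):
    """Concatenate per-sensor channel lists, in group order, into one image."""
    image = []
    for sensor, names in groups.items():
        channels = channels_by_sensor[sensor]
        if len(channels) != len(names):
            raise ValueError(f"{sensor}: expected {len(names)} channels, got {len(channels)}")
        image.extend(channels)
    return {"image": image}
```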

Model compatibility#

timm wrappers rebuild the input conv for any num_channels.
RCFBench and ImageStatsBench are band-agnostic.
The torchgeo RGB-only wrappers hold fixed-channel pretrained weights and don’t currently adapt to non-RGB inputs — see #16.
TorchGeoDOFABench accepts variable channels via wavelength tokens but the current wrapper hard-codes Sentinel-2 RGB wavelengths — see #15.

Data partitions (V1 only)#

V1 datasets honour the dataset.partition argument (which selects one of the partition JSON files distributed with each dataset). V2 datasets ignore it.

$ # Train on 1% of the V1 training split, write to a separate CSV
$ torchgeo-bench run dataset.partition=0.01x_train output=results/1pct.csv

Common partition values: default, 0.01x_train, 0.02x_train, 0.05x_train, 0.10x_train, 0.20x_train, 0.50x_train, 1.00x_train. The exact set available depends on which partition JSON files ship with the dataset.
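To check which partitions are available for a dataset, something like the following could be used. It is a sketch under an assumed naming scheme, "<partition>_partition.json" inside the dataset directory, which this page does not confirm; `load_partition` is a hypothetical helper.

```python
import json
from pathlib import Path

SUFFIX = "_partition.json"  # Assumed file-name suffix, not confirmed by this page.


def load_partition(dataset_dir, partition="default"):
    """Load the JSON file for the named partition, listing alternatives on failure."""
    dataset_dir = Path(dataset_dir)
    path = dataset_dir / f"{partition}{SUFFIX}"
    if not path.is_file():
        available = sorted(p.name[: -len(SUFFIX)] for p in dataset_dir.glob(f"*{SUFFIX}"))
        raise FileNotFoundError(f"no partition {partition!r}; available: {available}")
    with path.open() as handle:
        return json.load(handle)
```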