Models

This page is the operator-facing tour of the model backbones bundled with torchgeo-bench: which presets exist, how to invoke them, and how to add a new one. For the abstract base class and the full class reference, see torchgeo_bench.models.

Available presets

Every preset under src/torchgeo_bench/conf/model/ becomes a model=… selector for the run subcommand. A preset’s _target_ field resolves to a class re-exported from torchgeo_bench.models.
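As a rough illustration of what a _target_ field does, the sketch below resolves a dotted path to a class and calls it with the remaining config keys. It uses only the standard library and is not Hydra's implementation (Hydra ships hydra.utils.instantiate for this purpose). The helper names resolve_target and instantiate_preset are hypothetical, and fractions.Fraction merely stands in for a model class.

```python
import importlib


def resolve_target(dotted_path):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def instantiate_preset(config):
    """Call the class named by `_target_` with every other key as a keyword argument."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    return resolve_target(config["_target_"])(**kwargs)


# Stand-in preset: a stdlib class plays the role of a model class (assumption).
preset = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate_preset(preset)
print(type(obj).__name__, obj)
```

Command-line overrides such as model.features=1024 would, under this reading, simply change one of the keyword arguments before the class is called.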

Random Convolutional Features (RCF)

RCFBench. Gaussian or empirical random features in the spirit of MOSAIKS.

$ torchgeo-bench run model=rcf
$ torchgeo-bench run model=rcf model.mode=empirical model.features=1024
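To make the idea of random convolutional features concrete, here is a minimal pure-Python sketch of the Gaussian variant: random filters, a ReLU, and global average pooling. It is an assumption-laden toy, not the RCFBench implementation; the function name random_conv_features and its parameters are hypothetical, and it omits details such as bias terms. An empirical variant would draw its filters from patches of the data instead of from a Gaussian.

```python
import random


def random_conv_features(image, num_features=4, kernel_size=3, seed=0):
    """Featurize a C x H x W nested list with random Gaussian filters.

    Each feature is the spatial mean of ReLU(filter * image), using a
    "valid" convolution (no padding).
    """
    rng = random.Random(seed)
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    out_h, out_w = height - kernel_size + 1, width - kernel_size + 1
    features = []
    for _ in range(num_features):
        # One random filter spanning all input channels.
        kernel = [
            [[rng.gauss(0.0, 1.0) for _ in range(kernel_size)] for _ in range(kernel_size)]
            for _ in range(channels)
        ]
        total = 0.0
        for y in range(out_h):
            for x in range(out_w):
                response = sum(
                    kernel[c][dy][dx] * image[c][y + dy][x + dx]
                    for c in range(channels)
                    for dy in range(kernel_size)
                    for dx in range(kernel_size)
                )
                total += max(response, 0.0)  # ReLU
        features.append(total / (out_h * out_w))  # global average pooling
    return features


# Synthetic 3-channel 6x6 tile, for illustration only.
image = [[[float((x + y + c) % 5) for x in range(6)] for y in range(6)] for c in range(3)]
features = random_conv_features(image, num_features=8)
```

The feature vector length equals the number of filters, which is what a setting like model.features=1024 would control in this simplified picture.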

Image statistics baseline

ImageStatsBench. A trivial baseline that returns per-channel mean / std as the feature vector.

$ torchgeo-bench run model=imagestats
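The baseline is simple enough to sketch in a few lines. The following stdlib-only example computes a per-channel mean and standard deviation and concatenates them; the function name image_stats_features, the nested-list input layout, the output ordering, and the choice of population (rather than sample) standard deviation are all assumptions, not details taken from ImageStatsBench.

```python
from statistics import fmean, pstdev


def image_stats_features(image):
    """Return [mean per channel..., std per channel...] for a C x H x W nested list."""
    flat_channels = [[value for row in channel for value in row] for channel in image]
    means = [fmean(values) for values in flat_channels]
    # Population std is an assumption; a real implementation may use the sample estimate.
    stds = [pstdev(values) for values in flat_channels]
    return means + stds


image = [
    [[1.0, 3.0], [1.0, 3.0]],  # channel 0: mean 2, std 1
    [[5.0, 5.0], [5.0, 5.0]],  # channel 1: mean 5, std 0
]
features = image_stats_features(image)
```

A C-channel input yields a 2C-dimensional feature vector under these assumptions.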

timm — ImageNet-pretrained CNNs and ViTs

TimmPatchBenchModel. Configs under src/torchgeo_bench/conf/model/timm/ cover ResNet, ConvNeXt, EfficientNet, DenseNet, RegNet, MobileNetV3, VGG, MaxViT, and more. ViT / DeiT / Swin variants live under timm/vit/.

$ torchgeo-bench run model=timm/resnet50
$ torchgeo-bench run model=timm/convnext_base dataset.names=[m-eurosat]
$ torchgeo-bench run model=timm/vit/vit_base_patch16_224 dataset.image_size=224
$ torchgeo-bench run model=timm/vit/swin_base_patch4_window7_224 eval.skip_linear=true

ViT-style backbones expect a fixed spatial resolution. Setting dataset.image_size=224 resizes the dataset tiles for any model, not only ViTs. Resizing is bilinear by default; dataset.interpolation switches it to bicubic or nearest.

timm models rebuild their input convolution for any number of channels, so they work with dataset.bands=all out of the box (pretrained 3-channel weights are averaged / replicated as needed).
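One way to picture the weight adaptation is a replicate-and-rescale scheme, sketched below in plain Python for a single output filter. This is loosely modeled on how timm is generally understood to adapt pretrained input convolutions; whether torchgeo-bench relies on exactly this behaviour is an assumption, and adapt_input_kernels is a hypothetical name. Note that the single-channel case sums the RGB kernels here, whereas averaging is an equally plausible choice.

```python
import math


def adapt_input_kernels(rgb_kernels, in_chans):
    """Adapt the 3 per-channel kernels of one filter to `in_chans` input channels.

    rgb_kernels: list of 3 flat lists of weights (one per RGB input channel).
    """
    if in_chans == 3:
        return [list(kernel) for kernel in rgb_kernels]
    if in_chans == 1:
        # Collapse RGB into one kernel by summing weights position-wise.
        return [[sum(weights) for weights in zip(*rgb_kernels)]]
    # Tile the RGB kernels cyclically, truncate, then rescale so the total
    # weight mass stays comparable to the 3-channel original.
    repeats = math.ceil(in_chans / 3)
    tiled = (list(rgb_kernels) * repeats)[:in_chans]
    scale = 3.0 / in_chans
    return [[weight * scale for weight in kernel] for kernel in tiled]


rgb = [[1.0], [2.0], [3.0]]  # toy 1x1 kernels, for illustration only
six_channel = adapt_input_kernels(rgb, 6)
one_channel = adapt_input_kernels(rgb, 1)
```

With six input channels the RGB kernels appear twice at half strength, so a constant input produces the same response as it would through the original three-channel filter.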

torchgeo foundation models

Configs under src/torchgeo_bench/conf/model/torchgeo/. Most are RGB-only self-supervised checkpoints from torchgeo’s model hub.

$ # RGB self-supervised ResNets (Sentinel-2 MoCo / SeCo, fMoW GASSL)
$ torchgeo-bench run model=torchgeo/resnet50_s2rgb_moco
$ torchgeo-bench run model=torchgeo/resnet18_s2rgb_seco
$ torchgeo-bench run model=torchgeo/resnet50_fmow_gassl

$ # ScaleMAE on fMoW RGB
$ torchgeo-bench run model=torchgeo/scalemae_large_fmow

$ # DOFA — band-agnostic (currently configured for Sentinel-2 RGB wavelengths)
$ torchgeo-bench run model=torchgeo/dofa_base

$ # Satlas Swin-V2 (NAIP / Sentinel-2 RGB)
$ torchgeo-bench run model=torchgeo/swinv2b_naip_satlas_mi
$ torchgeo-bench run model=torchgeo/swinv2b_s2rgb_satlas_mi

$ # EarthLoc place-recognition descriptor
$ torchgeo-bench run model=torchgeo/earthloc_s2_resnet50

OlmoEarth (AI2)

OlmoEarthBenchModel. Requires the optional olmoearth extra:

$ pip install 'torchgeo-bench[olmoearth]'

$ # OlmoEarth v1 (Nano / Tiny / Base / Large)
$ torchgeo-bench run model=olmoearth_nano
$ torchgeo-bench run model=olmoearth_base
$ torchgeo-bench run model=olmoearth_large dataset.bands=all

$ # OlmoEarth v1.1 (Nano / Tiny / Base)
$ torchgeo-bench run model=olmoearth_v1_1_nano
$ torchgeo-bench run model=olmoearth_v1_1_tiny
$ torchgeo-bench run model=olmoearth_v1_1_base

OlmoEarth v1.1 uses a linear patch embedding (vs. convolutional in v1), a single bandset per modality, and updated masking/loss functions, yielding roughly a 3× reduction in MACs (multiply-accumulate operations) with comparable accuracy. The version parameter selects the weight family:

Config               Version  Size   Notes
olmoearth_nano       v1       Nano   multi-bandset, conv patch embed
olmoearth_tiny       v1       Tiny
olmoearth_base       v1       Base
olmoearth_large      v1       Large
olmoearth_v1_1_nano  v1.1     Nano   single-bandset, linear patch embed
olmoearth_v1_1_tiny  v1.1     Tiny
olmoearth_v1_1_base  v1.1     Base

Note

Input normalization is selected globally with dataset.normalization (default bandspec_zscore). Each model receives that strategy through BenchModel; use model_native for wrappers that declare pretrained input units / statistics, or identity when a backbone owns all normalization internally.

For datasets using Landsat imagery (e.g. m-forestnet), all OlmoEarth configs route Landsat through the Sentinel-2 normalizer via sensor_remap: {landsat: landsat_as_s2}; this is required because GeoBench delivers Landsat as uint8 [0, 255] and the Landsat normalizer expects a different dynamic range.
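The three strategy names can be read as a simple dispatch, sketched below. This is an interpretation rather than the library's code: the normalize function, its arguments, and the idea that model_native is a z-score against model-declared statistics are assumptions, and all statistics shown are made-up numbers.

```python
def normalize(values, strategy, bandspec_stats=None, native_stats=None):
    """Apply one of three normalization strategies to per-band values.

    Stats are lists of (mean, std) pairs, one pair per band.
    """
    if strategy == "identity":
        return list(values)  # the backbone handles normalization itself
    if strategy == "bandspec_zscore":
        stats = bandspec_stats  # statistics attached to the dataset's bands
    elif strategy == "model_native":
        stats = native_stats  # statistics declared by the model wrapper
    else:
        raise ValueError(f"unknown normalization strategy: {strategy!r}")
    return [(value - mean) / std for value, (mean, std) in zip(values, stats)]


values = [120.0, 60.0]
bandspec_stats = [(100.0, 20.0), (50.0, 10.0)]  # made-up numbers
native_stats = [(0.0, 255.0), (0.0, 255.0)]  # made-up numbers

raw = normalize(values, "identity")
zscored = normalize(values, "bandspec_zscore", bandspec_stats=bandspec_stats)
native = normalize(values, "model_native", native_stats=native_stats)
```

The same raw values land in very different ranges depending on which statistics are applied, which is why a normalizer built for one dynamic range cannot be reused blindly on data delivered in another.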

SAM 3 vision encoder

SAM3Encoder. Requires the optional sam3 extra and a local checkpoint at checkpoints/sam3/:

$ pip install 'torchgeo-bench[sam3]'
$ torchgeo-bench run model=sam3_encoder dataset.bands=[red,green,blue]

Adding a new model

There are two contribution pathways. Stage 1 lets you benchmark your model locally and report results in a paper without opening a PR. Stage 2 covers the full code contribution: exporting the class, writing tests, hosting weights, and submitting a PR.

See also

Evaluate your own model (Stage 1)

Stage 1 — evaluate your model locally and report results.

Contribute a model (Stage 2)

Stage 2 — contribute the model as a PR to the shared benchmark.

Note

Two key patterns apply regardless of stage:

  • Do not put bands in the Hydra YAML. The runner reads the current dataset’s BandSpec list and injects it into the constructor automatically. Adding bands to the YAML causes a TypeError (duplicate keyword argument).

  • Pass normalization="identity" to super().__init__ when your backbone handles normalization internally (e.g. OlmoEarth, Clay, any model whose forward() runs its own per-channel standardization). The sealed forward_patch_features will then pass raw sensor values straight to your _forward_patch_features without applying any additional z-score.
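Both patterns can be demonstrated with a toy class hierarchy. Everything in the sketch below is hypothetical: ToyBenchModel is a stand-in and not the real base class, the (name, mean, std) tuple layout for bands is invented, and build_model only mimics a runner that injects bands alongside YAML-provided keyword arguments.

```python
class ToyBenchModel:
    """Hypothetical stand-in for the real base class; NOT the torchgeo_bench API."""

    def __init__(self, bands, normalization="bandspec_zscore"):
        # `bands` is a list of (name, mean, std) tuples, a made-up layout.
        self.bands = bands
        self.normalization = normalization

    def forward_patch_features(self, values):
        """Sealed entry point: normalize unless told not to, then delegate."""
        if self.normalization != "identity":
            values = [(v - mean) / std for v, (_, mean, std) in zip(values, self.bands)]
        return self._forward_patch_features(list(values))

    def _forward_patch_features(self, values):
        raise NotImplementedError


class SelfNormalizingModel(ToyBenchModel):
    """A backbone that standardizes inputs itself, so the base class must not."""

    def __init__(self, bands, scale=1.0):
        super().__init__(bands, normalization="identity")
        self.scale = scale

    def _forward_patch_features(self, values):
        return [v * self.scale for v in values]  # receives raw sensor values


def build_model(model_cls, yaml_kwargs, bands):
    """Mimic a runner that injects `bands` next to the YAML-provided kwargs."""
    return model_cls(**yaml_kwargs, bands=bands)


bands = [("red", 100.0, 50.0), ("green", 80.0, 40.0)]
model = build_model(SelfNormalizingModel, {"scale": 2.0}, bands)
raw_features = model.forward_patch_features([150.0, 80.0])

duplicate_error = None
try:
    # Putting `bands` in the YAML kwargs collides with the injected argument.
    build_model(SelfNormalizingModel, {"scale": 2.0, "bands": ["red"]}, bands)
except TypeError as error:
    duplicate_error = str(error)
```

In this toy, the identity setting means the subclass sees the untouched inputs, and the second build_model call fails because Python refuses to pass the same keyword argument twice.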

For segmentation models, also pick the eval.segmentation.layers that the head will hook into — see Segmentation backbone layer reference for verified values per timm backbone family.