Models
This page is the operator-facing tour of the model backbones bundled
with torchgeo-bench: which presets exist, how to invoke them, and
how to add a new one. For the abstract base class and the full class
reference, see torchgeo_bench.models.
Available presets
Every preset under src/torchgeo_bench/conf/model/ becomes a
model=… selector for the run subcommand. A preset’s _target_
field resolves to a class re-exported from torchgeo_bench.models.
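The _target_ mechanism can be illustrated with a minimal sketch of how a dotted path is turned into a class and instantiated. This is not Hydra's implementation (hydra.utils.instantiate handles far more cases); resolve_target and instantiate are hypothetical helpers, and the standard-library fractions.Fraction stands in for a model class.

```python
import importlib


def resolve_target(path: str):
    """Split 'pkg.module.Name' into module and attribute, then import it."""
    module_name, _, attr = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def instantiate(cfg: dict):
    """Build the object named by cfg['_target_'], passing other keys as kwargs."""
    kwargs = {k: v for k, v in cfg.items() if k != "_target_"}
    return resolve_target(cfg["_target_"])(**kwargs)


# Stand-in config: a real preset would point at a torchgeo_bench.models class.
cfg = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate(cfg)
print(type(obj).__name__, obj)
```

Command-line overrides such as model.features=1024 would, under this reading, simply change one of the non-_target_ keys before the class is constructed.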
Random Convolutional Features (RCF)
RCFBench. Gaussian or empirical random
features in the spirit of MOSAIKS.
$ torchgeo-bench run model=rcf
$ torchgeo-bench run model=rcf model.mode=empirical model.features=1024
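As a rough illustration of the idea behind the two modes, the sketch below computes random convolutional features for a tiny single-channel image in pure Python. It is an assumption-laden toy, not the RCFBench implementation: rcf_features and its parameters are hypothetical, and details that real MOSAIKS-style pipelines typically include (patch whitening, bias terms, multiple channels) are omitted.

```python
import random


def rcf_features(image, num_features=4, kernel=3, mode="gaussian", seed=0):
    """Random convolutional features for a single-channel image (list of rows).

    Each feature is the spatial mean of ReLU(filter . patch) over all valid
    patch positions. Filters are Gaussian noise ("gaussian") or patches
    sampled from the image itself ("empirical").
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    positions = [(r, c) for r in range(h - kernel + 1) for c in range(w - kernel + 1)]

    def patch_at(r, c):
        return [image[r + i][c + j] for i in range(kernel) for j in range(kernel)]

    features = []
    for _ in range(num_features):
        if mode == "gaussian":
            filt = [rng.gauss(0.0, 1.0) for _ in range(kernel * kernel)]
        elif mode == "empirical":
            filt = patch_at(*rng.choice(positions))
        else:
            raise ValueError(f"unknown mode: {mode}")
        # Valid convolution followed by ReLU, then global average pooling.
        responses = [
            max(0.0, sum(f * p for f, p in zip(filt, patch_at(r, c))))
            for r, c in positions
        ]
        features.append(sum(responses) / len(responses))
    return features


image = [[float((r * 5 + c) % 7) for c in range(6)] for r in range(6)]
gaussian = rcf_features(image, num_features=4, mode="gaussian")
empirical = rcf_features(image, num_features=4, mode="empirical")
print(len(gaussian), len(empirical))
```

The filters are never trained; only the downstream classifier fitted on these features learns anything, which is what makes this family a cheap baseline.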
Image statistics baseline
ImageStatsBench. A trivial baseline that
returns per-channel mean / std as the feature vector.
$ torchgeo-bench run model=imagestats
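A minimal sketch of such a feature vector, assuming a channels-first image given as nested lists. The function name, the ordering (all means, then all standard deviations), and the use of the population standard deviation are assumptions made for illustration; ImageStatsBench may differ on each point.

```python
from statistics import fmean, pstdev


def image_stats_features(image):
    """Return [mean_0, ..., mean_C-1, std_0, ..., std_C-1] for a C x H x W image."""
    # Flatten every channel to a single list of pixel values.
    flat = [[v for row in channel for v in row] for channel in image]
    means = [fmean(ch) for ch in flat]
    stds = [pstdev(ch) for ch in flat]
    return means + stds


# Two channels of 2 x 2 pixels each.
image = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
features = image_stats_features(image)
print(features)
```

The feature length is 2 × C regardless of tile size, so this baseline needs no resizing and shows how much of a task can be solved from band statistics alone.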
timm: ImageNet-pretrained CNNs and ViTs
TimmPatchBenchModel. Configs under
src/torchgeo_bench/conf/model/timm/ cover ResNet, ConvNeXt,
EfficientNet, DenseNet, RegNet, MobileNetV3, VGG, MaxViT, and more.
ViT / DeiT / Swin variants live under timm/vit/.
$ torchgeo-bench run model=timm/resnet50
$ torchgeo-bench run model=timm/convnext_base dataset.names=[m-eurosat]
$ torchgeo-bench run model=timm/vit/vit_base_patch16_224 dataset.image_size=224
$ torchgeo-bench run model=timm/vit/swin_base_patch4_window7_224 eval.skip_linear=true
ViT-style backbones expect a fixed spatial resolution. Set
dataset.image_size=224 to resize the dataset tiles; the option applies
to any model, not only ViTs. Resampling is bilinear by default, and
dataset.interpolation switches it to bicubic or nearest.
timm models rebuild their input convolution for any number of channels,
so they work with dataset.bands=all out of the box (pretrained
3-channel weights are averaged / replicated as needed).
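To give a feel for what adapting 3-channel weights can involve, the sketch below expands the per-channel kernels of one RGB filter to an arbitrary number of input bands. This is one common scheme (cyclic replication plus rescaling), shown under the assumption that something similar happens here; adapt_input_weights is a hypothetical helper and the exact rule used by timm or TimmPatchBenchModel may differ.

```python
def adapt_input_weights(rgb_kernels, in_chans):
    """Expand the per-channel kernels of one RGB filter to `in_chans` channels.

    Kernels are repeated cyclically (R, G, B, R, ...) and scaled by
    3 / in_chans so the filter response stays on a similar scale.
    """
    if len(rgb_kernels) != 3:
        raise ValueError("expected exactly three per-channel kernels")
    scale = 3.0 / in_chans
    return [
        [w * scale for w in rgb_kernels[i % 3]]
        for i in range(in_chans)
    ]


# One filter with a flattened 2-tap kernel per RGB channel.
rgb = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
six_band = adapt_input_weights(rgb, 6)
print(len(six_band), six_band[3])
```

When the band count is a multiple of three, the total weight mass of the filter is preserved exactly; for other counts it is only approximately preserved.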
torchgeo foundation models
Configs under src/torchgeo_bench/conf/model/torchgeo/. Most are
RGB-only self-supervised checkpoints from torchgeo’s model hub.
$ # Sentinel-2 RGB SSL
$ torchgeo-bench run model=torchgeo/resnet50_s2rgb_moco
$ torchgeo-bench run model=torchgeo/resnet18_s2rgb_seco
$ torchgeo-bench run model=torchgeo/resnet50_fmow_gassl
$ # ScaleMAE on fMoW RGB
$ torchgeo-bench run model=torchgeo/scalemae_large_fmow
$ # DOFA — band-agnostic (currently configured for Sentinel-2 RGB wavelengths)
$ torchgeo-bench run model=torchgeo/dofa_base
$ # Satlas Swin-V2 (NAIP / Sentinel-2 RGB)
$ torchgeo-bench run model=torchgeo/swinv2b_naip_satlas_mi
$ torchgeo-bench run model=torchgeo/swinv2b_s2rgb_satlas_mi
$ # EarthLoc place-recognition descriptor
$ torchgeo-bench run model=torchgeo/earthloc_s2_resnet50
OlmoEarth (AI2)
OlmoEarthBenchModel. Requires the
optional olmoearth extra:
$ pip install 'torchgeo-bench[olmoearth]'
$ # OlmoEarth v1 (Nano / Tiny / Base / Large)
$ torchgeo-bench run model=olmoearth_nano
$ torchgeo-bench run model=olmoearth_base
$ torchgeo-bench run model=olmoearth_large dataset.bands=all
$ # OlmoEarth v1.1 (Nano / Tiny / Base)
$ torchgeo-bench run model=olmoearth_v1_1_nano
$ torchgeo-bench run model=olmoearth_v1_1_tiny
$ torchgeo-bench run model=olmoearth_v1_1_base
OlmoEarth v1.1 uses a linear patch embedding (vs. convolutional in v1),
a single bandset per modality, and updated masking/loss functions, yielding a
≈ 3× reduction in MACs with comparable accuracy. The version parameter
selects the weight family:
| Config | Version | Size | Notes |
|---|---|---|---|
| | v1 | Nano | multi-bandset, conv patch embed |
| | v1 | Tiny | |
| | v1 | Base | |
| | v1 | Large | |
| | v1.1 | Nano | single-bandset, linear patch embed |
| | v1.1 | Tiny | |
| | v1.1 | Base | |
Note
Input normalization is selected globally with dataset.normalization
(default bandspec_zscore). Each model receives that strategy through
BenchModel; use model_native for
wrappers that declare pretrained input units / statistics, or identity
when a backbone owns all normalization internally.
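The difference between the three strategies can be sketched as a small dispatch function. Everything here is illustrative: normalize, its arguments, and the example statistics are hypothetical, and the real logic lives in BenchModel.

```python
def normalize(values, strategy, dataset_stats, native_stats=None):
    """Normalize one band's pixel values according to the chosen strategy."""
    if strategy == "identity":
        # The backbone normalizes internally, so values pass through untouched.
        return list(values)
    if strategy == "bandspec_zscore":
        mean, std = dataset_stats
    elif strategy == "model_native":
        if native_stats is None:
            raise ValueError("model_native needs statistics declared by the wrapper")
        mean, std = native_stats
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [(v - mean) / std for v in values]


band = [100.0, 200.0, 300.0]
zscored = normalize(band, "bandspec_zscore", dataset_stats=(200.0, 100.0))
native = normalize(band, "model_native", (200.0, 100.0), native_stats=(0.0, 100.0))
raw = normalize(band, "identity", dataset_stats=(200.0, 100.0))
print(zscored, native, raw)
```

The same input band ends up on three different scales, which is why a mismatch between the chosen strategy and what a backbone expects can silently degrade results.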
For datasets using Landsat imagery (e.g. m-forestnet), all OlmoEarth
configs route Landsat through the Sentinel-2 normalizer via
sensor_remap: {landsat: landsat_as_s2}; this is required because
GeoBench delivers Landsat as uint8 [0, 255] and the Landsat normalizer
expects a different dynamic range.
SAM 3 vision encoder
SAM3Encoder. Requires the optional
sam3 extra and a local checkpoint at checkpoints/sam3/:
$ pip install 'torchgeo-bench[sam3]'
$ torchgeo-bench run model=sam3_encoder dataset.bands=[red,green,blue]
Adding a new model
There are two contribution pathways. Stage 1 lets you benchmark your model locally and report results in a paper without opening a PR. Stage 2 covers the full code contribution: exporting the class, writing tests, hosting weights, and submitting a PR.
See also
- Evaluate your own model (Stage 1)
Stage 1 — evaluate your model locally and report results.
- Contribute a model (Stage 2)
Stage 2 — contribute the model as a PR to the shared benchmark.
Note
Two key patterns apply regardless of stage:
- Do not put bands in the Hydra YAML. The runner reads the current dataset’s BandSpec list and injects it into the constructor automatically. Adding bands to the YAML causes a TypeError (duplicate keyword argument).
- Pass normalization="identity" to super().__init__ when your backbone handles normalization internally (e.g. OlmoEarth, Clay, any model whose forward() runs its own per-channel standardization). The sealed forward_patch_features will then pass raw sensor values straight to your _forward_patch_features without applying any additional z-score.
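The duplicate-keyword failure from the first pattern can be reproduced with plain Python. ToyModel and build are hypothetical stand-ins; the sketch assumes the runner passes the dataset's bands as an explicit keyword next to the unpacked YAML values, which is one way the TypeError described above could arise.

```python
class ToyModel:
    """Stand-in for a BenchModel subclass; only the constructor matters here."""

    def __init__(self, bands, features=8):
        self.bands = bands
        self.features = features


def build(model_cls, yaml_cfg, dataset_bands):
    """Mimic a runner that injects the dataset's bands as a keyword argument."""
    return model_cls(**yaml_cfg, bands=dataset_bands)


# Correct: the YAML carries only model options, the runner supplies bands.
ok = build(ToyModel, {"features": 16}, ["red", "green", "blue"])
print(ok.bands, ok.features)

# Incorrect: bands appears in the YAML as well, so it is passed twice.
error = None
try:
    build(ToyModel, {"features": 16, "bands": ["red"]}, ["red", "green", "blue"])
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)
```

Band selection therefore belongs on the dataset side (for example dataset.bands=all), never in the model preset.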
For segmentation models, also pick the eval.segmentation.layers that the
head will hook into; see Segmentation backbone layer reference for
verified values per timm backbone family.