ADE20K Segmentation Probe — canvas 12×12 @ 512px scene

Linear segmentation probe on the canvas features of canvit/canvitb16-add-vpe-pretrain-g128px-s512px-in21k-dv3b16-2026-02-02.

Paper: arXiv:2603.22570
Training code: github.com/m2b3/CanViT-specialize

Usage

uv add "canvit-pytorch @ git+https://github.com/m2b3/CanViT-PyTorch.git"

import torch
from canvit_pytorch.probes import SegmentationProbe

probe = SegmentationProbe.from_pretrained("canvit/probe-ade20k-40k-s512-c12-in21k").eval()

# [B, H, W, D] canvas features; in practice these come from a CanViT forward pass
# (a random tensor stands in for them here)
features = torch.randn(1, 12, 12, 1024)
with torch.inference_mode():
    logits = probe(features)    # [B, num_classes, H, W]
assert logits.shape == (1, 150, 12, 12)
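
The probe returns logits on the 12 × 12 canvas grid, so a full-resolution label map needs an upsampling step. The following is a minimal sketch of one common way to do that, assuming bilinear interpolation to the 512 px scene followed by a per-pixel argmax. The random `logits` tensor stands in for the probe output, and the interpolation mode is an assumption rather than something specified by this card.

```python
import torch
import torch.nn.functional as F

# Stand-in for the probe output: [B, num_classes, H, W] on the 12x12 canvas grid.
logits = torch.randn(1, 150, 12, 12)

# Upsample the logits (rather than the argmax) so class boundaries are
# interpolated instead of being blocky. Bilinear mode is an assumption.
upsampled = F.interpolate(
    logits, size=(512, 512), mode="bilinear", align_corners=False
)

# Per-pixel class index in [0, 149].
pred = upsampled.argmax(dim=1)  # [B, 512, 512]
```

Interpolating before the argmax keeps the class scores continuous across cell borders. Taking the argmax first and then resizing would need nearest-neighbour interpolation to avoid blending label indices.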

Training

Architecture: LayerNorm → Dropout → BatchNorm → Conv1×1.
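
As an illustration of that layer order, here is a hypothetical PyTorch sketch. It is not the `SegmentationProbe` class from `canvit_pytorch`. The class name, the use of `BatchNorm2d`, and the point at which the tensor is permuted to channels-first are assumptions. The feature width of 1024 and the 150 classes are taken from the usage example above.

```python
import torch
from torch import nn


class LinearSegProbeSketch(nn.Module):
    """Hypothetical re-creation of the stated layer order; not the library class."""

    def __init__(self, dim: int = 1024, num_classes: int = 150, dropout: float = 0.1):
        super().__init__()
        self.norm = nn.LayerNorm(dim)          # normalizes over the last (feature) axis
        self.drop = nn.Dropout(dropout)
        self.bn = nn.BatchNorm2d(dim)          # assumed 2D variant, channels-first
        self.head = nn.Conv2d(dim, num_classes, kernel_size=1)  # per-cell linear map

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [B, H, W, D]
        x = self.drop(self.norm(x))
        x = x.permute(0, 3, 1, 2)              # -> [B, D, H, W]
        return self.head(self.bn(x))           # -> [B, num_classes, H, W]


sketch = LinearSegProbeSketch().eval()
with torch.inference_mode():
    out = sketch(torch.randn(2, 12, 12, 1024))
```

A 1×1 convolution applies the same linear classifier to every canvas cell independently, which is why the probe is described as linear.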

Hyperparameter	Value
Scene size	512 px
Canvas grid	12 × 12
Glimpse size	128 px
Timesteps (T)	10
Training policy	R-IID
Optimizer	AdamW
Peak LR	$3 \times 10^{-4}$
Weight decay	$10^{-3}$
LR schedule	1,500-step warmup → cosine decay
Batch size	16
Max steps	40,000
Dropout	0.1
Augmentation	RandomResizedCrop scale [0.5, 2] + HFlip
Precision	bf16 (AMP)
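
The learning-rate schedule in the table can be sketched as a plain function of the step index. This sketch assumes a linear warmup starting from zero and a cosine decay that reaches zero at the final step. The table states neither the warmup shape nor the final learning rate, so both are assumptions, and `lr_at` is a hypothetical helper name.

```python
import math

PEAK_LR = 3e-4        # from the table
WARMUP_STEPS = 1_500  # from the table
MAX_STEPS = 40_000    # from the table


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (assumed linear warmup, cosine to 0)."""
    if step < WARMUP_STEPS:
        # Assumed linear ramp from 0 up to the peak.
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min((step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS), 1.0)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Dividing `lr_at(step)` by `PEAK_LR` gives a multiplier that could be passed to `torch.optim.lr_scheduler.LambdaLR` on top of an `AdamW` optimizer created with `lr=3e-4` and `weight_decay=1e-3`.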