arXiv submission date: 2026-04-20
📄 Abstract - A Controlled Benchmark of Visual State-Space Backbones with Domain-Shift and Boundary Analysis for Remote-Sensing Segmentation

Visual state-space models (SSMs) are increasingly promoted as efficient alternatives to Vision Transformers, yet their practical advantages remain unclear under fair comparison because existing studies rarely isolate encoder effects from decoder and training choices. We present a strictly controlled benchmark of representative visual SSM families, including VMamba, MambaVision, and Spatial-Mamba, for remote-sensing semantic segmentation, in which only the encoder varies across experiments. Evaluated on LoveDA and ISPRS Potsdam under a unified 4-stage feature interface and a fixed lightweight decoder, the benchmark reveals three main findings: (i) intra-family scaling yields only modest gains; (ii) cross-domain generalization is strongly asymmetric; and (iii) boundary delineation is the dominant failure mode under distribution shift. Although visual SSMs achieve favorable accuracy-efficiency trade-offs relative to the controlled CNN and Transformer baselines considered here, the results suggest that future improvements are more likely to come from robustness-oriented design and boundary-aware decoding than from encoder scaling alone. By isolating encoder behavior under a unified and reproducible protocol, this study establishes a practical reference benchmark for the design and evaluation of future Mamba-based segmentation backbones.
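The controlled protocol the abstract describes can be sketched schematically: every backbone is forced behind the same 4-stage feature interface, and the decoder is held fixed so that any accuracy difference is attributable to the encoder alone. The sketch below is illustrative only; class names, channel widths, and strides are assumptions, not taken from the paper's code.

```python
# Hypothetical sketch of an encoder-swap benchmark harness.
# Shapes are tracked as plain tuples (channels, height, width) to
# illustrate the interface contract without a deep-learning framework.

class FourStageEncoder:
    """Common interface: any backbone must emit 4 feature maps at
    strides 4, 8, 16, 32 with fixed channel widths (assumed values)."""
    channels = (96, 192, 384, 768)

    def forward(self, h, w):
        # One (channels, height, width) shape per stage.
        return [(c, h // s, w // s)
                for c, s in zip(self.channels, (4, 8, 16, 32))]

class FixedDecoder:
    """Held constant across all runs, per the controlled protocol."""
    def forward(self, feats, num_classes=7):
        # Fuse to stride-4 resolution, predict per-pixel class scores.
        _, h4, w4 = feats[0]
        return (num_classes, h4, w4)

def run_benchmark(encoder, h=512, w=512):
    # Only `encoder` varies between experiments.
    feats = encoder.forward(h, w)
    return FixedDecoder().forward(feats)

# Swapping backbones leaves the interface and decoder untouched:
class VMambaLike(FourStageEncoder): pass
class SpatialMambaLike(FourStageEncoder): pass

print(run_benchmark(VMambaLike()))        # (7, 128, 128)
print(run_benchmark(SpatialMambaLike()))  # (7, 128, 128)
```

The design point is that the decoder never sees which family produced the features, so encoder effects are isolated by construction.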

Top-level tags: computer vision, model evaluation, machine learning
Detailed tags: benchmark, visual state-space models, semantic segmentation, remote sensing, domain shift

A Controlled Benchmark of Visual State-Space Backbones with Domain-Shift and Boundary Analysis for Remote-Sensing Segmentation


1️⃣ One-Sentence Summary

This paper builds a strictly controlled benchmark that compares multiple visual state-space models (such as VMamba) for remote-sensing image segmentation under a unified decoder. It finds that these models strike a good balance between accuracy and efficiency, but that boundary delineation becomes the main bottleneck under distribution shift; future gains should therefore come from robustness-oriented design and boundary-aware decoding rather than from simply scaling up model size.

Source: arXiv: 2604.18721