arXiv submission date: 2026-06-04
📄 Abstract - GMBFormer: An NDVI-Guided Global Memory Bank Transformer for Urban Green-Space Extraction from Ultra-High-Resolution Imagery

Urban green-space extraction from ultra-high-resolution (UHR) imagery is commonly performed patch by patch, which limits semantic reuse among spatially separated but visually similar vegetation patterns. Directly injecting the Normalized Difference Vegetation Index (NDVI) into red-green-blue (RGB) backbones can also blur the roles of visual appearance learning and physical vegetation confidence. We propose GMBFormer, a SegFormer-based framework that replaces adjacency-driven feature propagation with selective, similarity-driven prototype retrieval. Only RGB channels enter the backbone and decoder, while NDVI is decoupled as a physics-informed gate that admits high-confidence vegetation descriptors into a compact global memory bank through momentum updates. During training and inference, the current patch queries stored prototypes through memory-mediated cross-attention, and the retrieved response is integrated with bounded overhead. Experiments use a self-constructed Chengdu UHR dataset with 7,700 labeled 512 x 512 patches and two reduced-label settings derived from the public International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam dataset. Under the same training and evaluation protocol, GMBFormer obtains mean intersection over union (mIoU)/mean Dice (mDice) scores of 89.25%/94.31%, 92.17%/95.92%, and 83.72%/90.86%, respectively, improving the controlled SegFormer-B4 baseline in each setting. Ablation studies indicate that decoupled NDVI admission, memory retrieval, capacity, and momentum jointly shape the final performance.
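The gating, momentum update, and retrieval steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MemoryBank`, the NDVI threshold of 0.4, the capacity, the momentum value, the "update the most similar prototype when full" policy, and the absence of learned query/key/value projections are all assumptions made for illustration. The abstract does not specify how the retrieved response is fused with the patch features, so the sketch stops at retrieval.

```python
import math


def ndvi(nir, red, eps=1e-6):
    """Standard NDVI = (NIR - Red) / (NIR + Red); eps avoids division by zero."""
    return (nir - red) / (nir + red + eps)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _l2_normalize(v, eps=1e-12):
    norm = math.sqrt(_dot(v, v))
    return [x / (norm + eps) for x in v]


class MemoryBank:
    """Hypothetical global memory bank of vegetation prototypes (illustrative only)."""

    def __init__(self, capacity, momentum=0.9, ndvi_threshold=0.4):
        self.capacity = capacity              # assumed fixed number of prototypes
        self.momentum = momentum              # assumed momentum coefficient
        self.ndvi_threshold = ndvi_threshold  # assumed admission threshold
        self.prototypes = []                  # unit-length RGB feature descriptors

    def admit(self, descriptor, ndvi_value):
        """NDVI acts only as a gate: it decides admission but is never stored."""
        if ndvi_value < self.ndvi_threshold:
            return False
        d = _l2_normalize(descriptor)
        if len(self.prototypes) < self.capacity:
            self.prototypes.append(d)
            return True
        # Bank is full: blend the descriptor into its most similar prototype.
        sims = [_dot(p, d) for p in self.prototypes]
        k = max(range(len(sims)), key=sims.__getitem__)
        m = self.momentum
        blended = [m * p + (1.0 - m) * x for p, x in zip(self.prototypes[k], d)]
        self.prototypes[k] = _l2_normalize(blended)
        return True

    def retrieve(self, query):
        """Single-head scaled dot-product attention of one query over the prototypes."""
        if not self.prototypes:
            return [0.0] * len(query)
        scale = 1.0 / math.sqrt(len(query))
        scores = [scale * _dot(query, p) for p in self.prototypes]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return [
            sum(w / total * p[i] for w, p in zip(weights, self.prototypes))
            for i in range(len(query))
        ]


# Usage: a low-NDVI descriptor is rejected, a high-NDVI one is stored.
bank = MemoryBank(capacity=2)
bank.admit([1.0, 0.0], ndvi(nir=0.1, red=0.5))  # rejected: NDVI is negative
bank.admit([2.0, 0.0], ndvi(nir=0.8, red=0.1))  # admitted: NDVI is about 0.78
response = bank.retrieve([1.0, 0.0])
```

Keeping NDVI out of the stored descriptors mirrors the decoupling the abstract describes: the prototypes are built from RGB features alone, while NDVI only controls which descriptors are allowed to shape the bank.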

Top-level tags: computer vision, machine learning
Detailed tags: semantic segmentation, urban green-space extraction, memory bank, remote sensing, NDVI

GMBFormer: An NDVI-Guided Global Memory Bank Transformer for Urban Green-Space Extraction from Ultra-High-Resolution Imagery


1️⃣ One-Sentence Summary

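The paper proposes GMBFormer, a deep learning framework that treats the vegetation index (NDVI) as a separate "gate" signal rather than a backbone input, so that the model can retrieve and reuse similar vegetation patterns across image patches when identifying urban green space, improving extraction accuracy on ultra-high-resolution remote sensing imagery over a controlled SegFormer-B4 baseline while keeping the added overhead bounded.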

Source: arXiv:2606.06363