Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion
1️⃣ One-Sentence Summary
This paper proposes Co2S, a stable semi-supervised remote sensing image segmentation framework. By combining the complementary strengths of two vision foundation models, CLIP and DINOv3, and designing co-guidance and co-fusion mechanisms, it effectively tackles pseudo-label drift and error accumulation during training, achieving leading segmentation accuracy across diverse scenarios.
Semi-supervised remote sensing (RS) image semantic segmentation offers a promising solution to alleviate the burden of exhaustive annotation, yet it fundamentally struggles with pseudo-label drift, a phenomenon where confirmation bias leads to the accumulation of errors during training. In this work, we propose Co2S, a stable semi-supervised RS segmentation framework that synergistically fuses priors from vision-language models and self-supervised models. Specifically, we construct a heterogeneous dual-student architecture comprising two distinct ViT-based vision foundation models initialized with pretrained CLIP and DINOv3 to mitigate error accumulation and pseudo-label drift. To effectively incorporate these distinct priors, an explicit-implicit semantic co-guidance mechanism is introduced that utilizes text embeddings and learnable queries to provide explicit and implicit class-level guidance, respectively, thereby jointly enhancing semantic consistency. Furthermore, a global-local feature collaborative fusion strategy is developed to effectively fuse the global contextual information captured by CLIP with the local details produced by DINOv3, enabling the model to generate highly precise segmentation results. Extensive experiments on six popular datasets demonstrate the superiority of the proposed method, which consistently achieves leading performance across various partition protocols and diverse scenarios. Project page is available at this https URL.
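The abstract's three components (heterogeneous dual students, explicit-implicit co-guidance, global-local co-fusion) can be pictured with a minimal PyTorch sketch. Everything below is an illustrative assumption rather than the paper's released code: the toy patch-embedding backbones stand in for CLIP- and DINOv3-pretrained ViTs, and the shared guidance head, 1×1-conv fusion, and cross-pseudo-label loss are one plausible wiring of the ideas, not the authors' exact recipe.

```python
# Illustrative sketch only: toy stand-ins for the CLIP/DINOv3 students,
# an assumed shared guidance head, and an assumed cross-supervision loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentBranch(nn.Module):
    """Toy stand-in for one ViT student (CLIP- or DINOv3-initialized)."""
    def __init__(self, dim=256):
        super().__init__()
        # A single strided conv mimics a patch-embedding backbone.
        self.backbone = nn.Conv2d(3, dim, kernel_size=16, stride=16)

    def forward(self, x):          # x: (B, 3, H, W)
        return self.backbone(x)    # dense features: (B, C, H/16, W/16)

class CoGuidance(nn.Module):
    """Explicit (text-embedding) plus implicit (learnable-query) guidance."""
    def __init__(self, dim=256, num_classes=6):
        super().__init__()
        # Explicit guidance: one frozen class embedding per category
        # (per the abstract these come from text embeddings; random here).
        self.register_buffer(
            "text_emb", F.normalize(torch.randn(num_classes, dim), dim=-1))
        # Implicit guidance: learnable class queries.
        self.queries = nn.Parameter(torch.randn(num_classes, dim) * 0.02)

    def forward(self, feat):       # feat: (B, C, h, w)
        B, C, h, w = feat.shape
        f = F.normalize(feat.flatten(2).transpose(1, 2), dim=-1)  # (B, hw, C)
        explicit = f @ self.text_emb.t()                          # (B, hw, K)
        implicit = f @ F.normalize(self.queries, dim=-1).t()      # (B, hw, K)
        logits = explicit + implicit
        return logits.transpose(1, 2).reshape(B, -1, h, w)        # (B, K, h, w)

class Co2SSketch(nn.Module):
    def __init__(self, dim=256, num_classes=6):
        super().__init__()
        self.clip_student = StudentBranch(dim)  # global-context branch
        self.dino_student = StudentBranch(dim)  # local-detail branch
        self.fuse = nn.Conv2d(2 * dim, dim, 1)  # co-fusion (assumed 1x1 conv)
        self.guide = CoGuidance(dim, num_classes)

    def forward(self, x):
        g = self.clip_student(x)                # CLIP prior: global context
        l = self.dino_student(x)                # DINOv3 prior: local details
        fused = self.fuse(torch.cat([g, l], dim=1))
        return self.guide(g), self.guide(l), self.guide(fused)

model = Co2SSketch()
unlabeled = torch.randn(2, 3, 128, 128)
logits_g, logits_l, logits_fused = model(unlabeled)

# One common dual-student recipe (our assumption, not necessarily the
# paper's exact loss): each student learns from the other's pseudo-labels,
# so the two heterogeneous priors cross-check each other's drift.
loss_unsup = (F.cross_entropy(logits_g, logits_l.argmax(1).detach())
              + F.cross_entropy(logits_l, logits_g.argmax(1).detach()))
print(logits_fused.shape, float(loss_unsup))
```

In such a setup, the fused logits would carry the supervised loss on labeled images, while the cross-pseudo-label term regularizes training on unlabeled ones; the heterogeneity of the two priors is what keeps a single model's confirmation bias from reinforcing itself.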
Source: arXiv:2512.23035