DiffuSAM: Diffusion Guided Zero-Shot Object Grounding for Remote Sensing Imagery
1️⃣ One-Sentence Summary
This paper proposes a new method called DiffuSAM, which combines text-aware diffusion models with advanced image segmentation models to locate and box target objects more accurately in complex remote sensing imagery, without any additional training. Experiments show its localization accuracy improves by more than 14% over the best existing methods.
Diffusion models have emerged as powerful tools for a wide range of vision tasks, including text-guided image generation and editing. In this work, we explore their potential for object grounding in remote sensing imagery. We propose a hybrid pipeline that integrates diffusion-based localization cues with state-of-the-art segmentation models such as RemoteSAM and SAM3 to obtain more accurate bounding boxes. By leveraging the complementary strengths of generative diffusion models and foundational segmentation models, our approach enables robust and adaptive object localization across complex scenes. Experiments demonstrate that our pipeline significantly improves localization performance, achieving over a 14% increase in Acc@0.5 compared to existing state-of-the-art methods.
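The abstract does not spell out how the diffusion cues and the segmentation model are fused, but a common recipe for this kind of hybrid is to turn the diffusion model's text-conditioned cross-attention into a relevance heatmap, convert its peaks into point prompts for a SAM-style predictor, and take the bounding box of the best-scoring mask. The sketch below illustrates that general recipe only, not the paper's released code; `diffusion_heatmap_fn`, `sam_predictor`, and all helper names are hypothetical placeholders.

```python
# Illustrative sketch: diffusion heatmap -> point prompts -> SAM mask -> box.
# Assumes a SAM-style predictor exposing set_image() and
# predict(point_coords, point_labels), as in the public SAM predictor API.

import numpy as np


def heatmap_to_point_prompts(heatmap: np.ndarray, top_k: int = 3):
    """Pick the top-k hottest pixels as positive point prompts in (x, y) order."""
    flat_idx = np.argsort(heatmap.ravel())[::-1][:top_k]
    ys, xs = np.unravel_index(flat_idx, heatmap.shape)
    points = np.stack([xs, ys], axis=1)        # shape (k, 2)
    labels = np.ones(len(points), dtype=int)   # 1 = positive prompt
    return points, labels


def mask_to_bbox(mask: np.ndarray):
    """Convert a binary mask to an axis-aligned box (x0, y0, x1, y1), or None if empty."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def ground_object(image, text_query, diffusion_heatmap_fn, sam_predictor):
    """
    diffusion_heatmap_fn: callable(image, text) -> HxW relevance map,
        e.g. aggregated cross-attention for the query tokens (hypothetical).
    sam_predictor: SAM-style object with set_image(image) and
        predict(point_coords, point_labels) -> (masks, scores, ...).
    """
    heatmap = diffusion_heatmap_fn(image, text_query)
    points, labels = heatmap_to_point_prompts(heatmap)

    sam_predictor.set_image(image)
    masks, scores, _ = sam_predictor.predict(
        point_coords=points, point_labels=labels
    )
    best_mask = masks[np.argmax(scores)]
    return mask_to_bbox(best_mask)
```

Because both components are used off-the-shelf, a pipeline of this shape stays zero-shot: the text query drives the diffusion heatmap, and the segmenter only refines that cue into a tight box.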
From arXiv: 2604.18201