Global-Local Feature Decoding with Adapter-Guided SAMv2 for Salient Object Detection
1️⃣ One-Sentence Summary
This paper proposes GLASSNet, a method that freezes the large vision model SAMv2, adds a lightweight adapter, and uses dual decoders to separately capture global semantics and local details, achieving efficient and accurate salient object detection that outperforms existing methods.
Salient Object Detection (SOD) remains an essential yet underexplored task in the era of large-scale vision models. Although foundation models like SAM exhibit strong generalization, their potential for SOD is not fully realized, and training or fully fine-tuning them is computationally expensive and prone to overfitting under limited data. To overcome these challenges, we introduce GLASSNet, a Global-Local feature decoding framework that uses SAMv2 as a frozen encoder paired with a lightweight, spatially aware convolutional adapter, reducing learnable encoder parameters by over 97%. To enhance saliency quality, GLASSNet employs a dual-decoder architecture: one decoder captures global, long-range semantics with an expanded receptive field, while the other captures fine local details such as edges and textures. Fusing these complementary cues yields saliency maps that combine global coherence with local precision, producing accurate final masks. Extensive experiments on standard SOD and camouflaged object detection benchmarks show that GLASSNet surpasses state-of-the-art methods, demonstrating the power of frozen foundation models combined with targeted adaptation and global-local decoding.
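The architecture described above can be illustrated with a minimal PyTorch sketch. This is a hypothetical reconstruction, not the authors' code: the real SAMv2 image encoder is replaced by a tiny stand-in, and all module names, channel widths, and decoder designs (dilated convolutions for the global branch, small kernels for the local branch) are illustrative assumptions based only on the abstract.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the GLASSNet design from the abstract.
# The frozen SAMv2 encoder is replaced by a small stand-in module.

class FrozenEncoder(nn.Module):
    """Stand-in for the frozen SAMv2 image encoder (no gradients)."""
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.GELU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.GELU(),
        )
        for p in self.parameters():
            p.requires_grad = False  # encoder stays frozen

class SpatialAdapter(nn.Module):
    """Lightweight, spatially aware conv adapter: the only
    trainable encoder-side component (bottleneck + depthwise conv)."""
    def __init__(self, ch=64, hidden=8):
        super().__init__()
        self.down = nn.Conv2d(ch, hidden, 1)
        self.dw = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)
        self.up = nn.Conv2d(hidden, ch, 1)

    def forward(self, x):
        return x + self.up(torch.relu(self.dw(self.down(x))))

class GlassNetSketch(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = FrozenEncoder(ch)
        self.adapter = SpatialAdapter(ch)
        # Global decoder: dilated conv expands the receptive field
        # to capture long-range semantics.
        self.global_dec = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=4, dilation=4), nn.GELU(),
            nn.Conv2d(ch, 1, 1),
        )
        # Local decoder: small kernels for edges and textures.
        self.local_dec = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.GELU(),
            nn.Conv2d(ch, 1, 1),
        )
        # Fuse the two complementary saliency cues into one map.
        self.fuse = nn.Conv2d(2, 1, 1)

    def forward(self, x):
        feats = self.adapter(self.encoder.body(x))
        sal = self.fuse(torch.cat(
            [self.global_dec(feats), self.local_dec(feats)], dim=1))
        # Upsample logits back to input resolution.
        return nn.functional.interpolate(
            sal, size=x.shape[-2:], mode="bilinear", align_corners=False)

net = GlassNetSketch()
out = net(torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 1, 64, 64])

# Only adapter + decoders train; the encoder contributes zero
# trainable parameters, mirroring the >97% reduction claim in spirit.
trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
total = sum(p.numel() for p in net.parameters())
print(f"trainable fraction: {trainable / total:.2%}")
```

In the actual method the frozen encoder is SAMv2's ViT-based image encoder, so the trainable fraction would be far smaller than in this toy stand-in; the sketch only shows the wiring of adapter, dual decoders, and fusion.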
Source: arXiv: 2605.02616