Explicit Uncertainty Modeling for Active CLIP Adaptation with Dual Prompt Tuning
1️⃣ One-Sentence Summary
This paper proposes a new method that introduces two learnable text prompts, one positive and one negative, into the CLIP model. The approach not only improves the model's recognition ability on a specific image classification task, but also directly estimates the reliability of its predictions. Under a limited annotation budget, this allows the most valuable samples to be selected for human labeling more intelligently, significantly improving the efficiency of active learning.
Pre-trained vision-language models such as CLIP exhibit strong transferability, yet adapting them to downstream image classification tasks under limited annotation budgets remains challenging. In active learning settings, the model must select the most informative samples for annotation from a large pool of unlabeled data. Existing approaches typically estimate uncertainty via entropy-based criteria or representation clustering, without explicitly modeling uncertainty from the model's perspective. In this work, we propose a robust uncertainty modeling framework for active CLIP adaptation based on dual-prompt tuning. We introduce two learnable prompts in the textual branch of CLIP. The positive prompt enhances the discriminability of the task-specific textual embeddings aligned with the lightweight-tuned visual embeddings, improving classification reliability. Meanwhile, the negative prompt is trained in a reversed manner to explicitly model the probability that the predicted label is correct, providing a principled uncertainty signal for guiding active sample selection. Extensive experiments across different fine-tuning paradigms demonstrate that our method consistently outperforms existing active learning methods under the same annotation budget.
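A minimal sketch of how the two prompt heads could drive sample selection, assuming image and text embeddings have already been produced by CLIP (here as precomputed, L2-normalized tensors) and that the negative-prompt head yields a per-class reliability score for the predicted label. The function name, tensor shapes, and the least-reliability selection rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def select_for_annotation(image_feats, pos_text_feats, neg_text_feats,
                          budget, temperature=0.01):
    """Hypothetical dual-prompt active selection sketch.

    image_feats:    [N, D] visual embeddings of the unlabeled pool
    pos_text_feats: [C, D] class embeddings from the learnable positive prompt
    neg_text_feats: [C, D] class embeddings from the learnable negative prompt
    """
    # Positive prompt: task-specific classification logits and predicted labels.
    pos_logits = image_feats @ pos_text_feats.t() / temperature   # [N, C]
    pred = pos_logits.argmax(dim=1)                               # [N]

    # Negative prompt: scores how reliable the predicted label is
    # (assumption: higher score = more likely correct).
    neg_logits = image_feats @ neg_text_feats.t() / temperature   # [N, C]
    reliability = F.softmax(neg_logits, dim=1).gather(1, pred[:, None]).squeeze(1)

    # Query the least reliable samples, i.e. the most informative ones to label.
    query_idx = torch.argsort(reliability)[:budget]
    return pred, reliability, query_idx

if __name__ == "__main__":
    # Toy usage with random features (N pool samples, C classes, D-dim embeddings).
    N, C, D = 1000, 10, 512
    img = F.normalize(torch.randn(N, D), dim=1)
    pos = F.normalize(torch.randn(C, D), dim=1)
    neg = F.normalize(torch.randn(C, D), dim=1)
    pred, rel, idx = select_for_annotation(img, pos, neg, budget=16)
    print(idx.shape)  # torch.Size([16])
```

The design choice here is that the positive head is used only for prediction, while the uncertainty signal comes entirely from the reversed-trained negative head, matching the abstract's separation of classification and reliability estimation.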
Source: arXiv: 2602.04340