📄 Abstract - Contamination Detection for VLMs using Multi-Modal Semantic Perturbation

Recent advances in Vision-Language Models (VLMs) have achieved state-of-the-art performance on numerous benchmark tasks. However, the use of internet-scale, often proprietary, pretraining corpora raises a critical concern for both practitioners and users: inflated performance due to test-set leakage. While prior works have proposed mitigation strategies such as decontamination of pretraining data and benchmark redesign for LLMs, the complementary direction of developing detection methods for contaminated VLMs remains underexplored. To address this gap, we deliberately contaminate open-source VLMs on popular benchmarks and show that existing detection approaches either fail outright or exhibit inconsistent behavior. We then propose a novel, simple yet effective detection method based on multi-modal semantic perturbation, demonstrating that contaminated models fail to generalize under controlled perturbations. Finally, we validate our approach across multiple realistic contamination strategies, confirming its robustness and effectiveness. The code and perturbed dataset will be released publicly.
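The core detection logic described in the abstract can be read as an accuracy-gap test: a contaminated model has memorized the original benchmark items, so its performance should drop sharply on semantically perturbed variants, while a clean model degrades far less. The sketch below is a minimal illustration of that idea, not the paper's released code; the example format, the helper functions, and the 0.15 threshold are all assumptions made for illustration.

```python
from typing import Callable

# Each benchmark example is assumed to be a dict with an "answer" field;
# `model` maps an example to a predicted answer string.

def accuracy(model: Callable[[dict], str], examples: list[dict]) -> float:
    """Fraction of examples where the model's prediction matches the label."""
    correct = sum(model(ex) == ex["answer"] for ex in examples)
    return correct / len(examples)

def contamination_score(model, original: list[dict], perturbed: list[dict]) -> float:
    """Accuracy gap between the original benchmark and its perturbed twin."""
    return accuracy(model, original) - accuracy(model, perturbed)

def flag_contaminated(model, original, perturbed, threshold: float = 0.15) -> bool:
    # A large accuracy drop under semantic perturbation suggests the model
    # memorized the test set rather than learning the underlying task.
    # The 0.15 threshold is an illustrative placeholder, not the paper's value.
    return contamination_score(model, original, perturbed) > threshold
```

A clean model should score near zero on this gap, since the perturbations preserve the tasks' semantics; only memorization of the exact original items produces a large, detectable drop.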

Top-level tags: model evaluation, multi-modal, computer vision
Detailed tags: contamination detection, vision-language models, test-set leakage, semantic perturbation, generalization

📄 Paper Summary

Contamination Detection for VLMs using Multi-Modal Semantic Perturbation


1️⃣ One-Sentence Summary

This paper proposes a new method that uses multi-modal semantic perturbation to detect whether a vision-language model's training data leaked test-set information, addressing the failures of existing detection techniques.

