arXiv submission date: 2026-06-23
📄 Abstract - Revealing Training Data Exposure in Vision Language Large Models via Parameter Gradients

Vision-Language Large Models (VLLMs) trained on massive crawled corpora raise pressing copyright and data-provenance concerns. These concerns are particularly acute in healthcare, where patient medical images paired with clinical reports demand rigorous privacy safeguards. However, existing training data detection methods either fail in cross-modal scenarios or rely on superficial output signals with insufficient discriminative power. We introduce GradAudit, a gradient-based auditing framework that examines internal optimization dynamics rather than treating VLLMs as black boxes. Our approach builds on a key observation: model parameters converge to regions where gradients on training samples become stable and well-aligned, whereas gradients on non-training samples remain noisy and inconsistent. By analyzing these gradient signatures, GradAudit achieves strong separability and detects genuine image-text associations learned during training, not merely individual modality membership. Empirically, across both medical and general-domain datasets, GradAudit substantially outperforms state-of-the-art baselines in both pretraining and fine-tuning VLLMs. In a case study employing copyrighted content, we show that existing training data detection methods not only underestimate the extent of unauthorized data usage, but that this underestimation becomes more pronounced as models become more recent and more advanced.

Top-level tags: llm, multi-modal, medical
Detailed tags: data exposure, gradient analysis, training data detection, vision language model, privacy

Revealing Training Data Exposure in Vision Language Large Models via Parameter Gradients


1️⃣ One-sentence summary

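This paper proposes GradAudit, a gradient-based auditing framework that analyzes how the stability of parameter gradients differs between training and non-training samples, and uses that difference to detect whether a vision-language large model was trained on specific image-text pairs, helping to identify potential copyright infringement or privacy leakage.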
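
The abstract's key observation (gradients on training samples are stable and well-aligned, while gradients on non-training samples stay noisy) can be illustrated with a toy statistic. The sketch below is not GradAudit's actual scoring rule, which this page does not specify. It assumes, purely for illustration, that "alignment" is measured as the mean pairwise cosine similarity across several gradient vectors collected for one sample; the function name `alignment_score` and the synthetic gradients are hypothetical.

```python
import numpy as np


def alignment_score(grads: np.ndarray) -> float:
    """Mean pairwise cosine similarity between the rows of `grads` (shape k x d).

    Hypothetical statistic: values near 1 mean the gradient vectors point the
    same way ("well-aligned"); values near 0 mean they are unrelated ("noisy").
    """
    grads = np.asarray(grads, dtype=float)
    k = grads.shape[0]
    if k < 2:
        raise ValueError("need at least two gradient vectors")
    # Normalize each row to unit length, guarding against zero-norm rows.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    unit = grads / np.clip(norms, 1e-12, None)
    # Cosine similarity matrix; keep only the strict upper triangle (i < j).
    cos = unit @ unit.T
    upper = np.triu_indices(k, k=1)
    return float(cos[upper].mean())


# Synthetic stand-ins for real parameter gradients (assumption, not real data):
# a "member-like" sample whose gradients share one direction plus small noise,
# and a "non-member-like" sample whose gradients are independent random vectors.
rng = np.random.default_rng(0)
d, k = 64, 8
direction = rng.normal(size=d)
stable_grads = direction + 0.01 * rng.normal(size=(k, d))
noisy_grads = rng.normal(size=(k, d))

stable_score = alignment_score(stable_grads)
noisy_score = alignment_score(noisy_grads)
print(f"stable: {stable_score:.3f}  noisy: {noisy_score:.3f}")
```

In a real audit the rows of `grads` would come from backpropagating the model's loss on an image-text pair, and a decision threshold on the score would have to be calibrated on data with known membership; both steps are outside what this sketch covers.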

Source: arXiv: 2606.24774