arXiv submission date: 2026-03-30
📄 Abstract - Membership Inference Attacks against Large Audio Language Models

We present the first systematic Membership Inference Attack (MIA) evaluation of Large Audio Language Models (LALMs). Because audio encodes non-semantic information, it induces severe train/test distribution shifts that can lead to spurious MIA performance. Using a multi-modal blind baseline built on textual, spectral, and prosodic features, we demonstrate that common speech datasets exhibit near-perfect train/test separability (AUC ≈ 1.0) even without any model inference, and that standard MIA scores correlate strongly with these blind acoustic artifacts (correlation > 0.7). Using this blind baseline, we identify distribution-matched datasets that enable reliable MIA evaluation free of distribution-shift confounds. We benchmark multiple MIA methods and conduct modality-disentanglement experiments on these datasets. The results reveal that LALM memorization is cross-modal, arising only from binding a speaker's vocal identity to the spoken text. These findings establish a principled standard for auditing LALMs beyond spurious correlations.
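The core diagnostic in the abstract is that a "blind" classifier — one that sees only acoustic features, never the model — can already separate train from test clips when the dataset has a distribution shift. A minimal sketch of that idea, with a hypothetical 1-D feature (e.g. mean spectral energy) and synthetic shifted data standing in for the paper's actual feature set:

```python
import random

def auc(pos, neg):
    """Rank-based AUC: probability a positive example scores above a negative one."""
    wins = ties = 0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

random.seed(0)
# Hypothetical 1-D blind acoustic feature per clip. Member (train) clips are
# drawn from one recording condition, non-member (test) clips from a shifted
# one -- no model inference is involved anywhere.
members = [random.gauss(0.0, 1.0) for _ in range(500)]
nonmembers = [random.gauss(3.0, 1.0) for _ in range(500)]

# A shifted dataset lets the blind baseline reach near-perfect separability,
# mirroring the paper's AUC ≈ 1.0 finding on common speech datasets.
print(f"blind-baseline AUC: {auc(nonmembers, members):.3f}")
```

If this blind AUC is high, any MIA score computed on the same split may simply be rediscovering the acoustic shift; the paper's distribution-matched datasets are constructed so the blind baseline stays near 0.5.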

Top tags: audio model evaluation machine learning
Detailed tags: membership inference attack audio language models privacy auditing distribution shift multi-modal evaluation

Membership Inference Attacks against Large Audio Language Models


1️⃣ One-sentence summary

This work presents the first systematic evaluation of membership inference attacks against large audio language models. It finds that non-semantic features in audio data (such as a speaker's voice) can produce spuriously high attack success rates, and reveals that model memorization hinges on binding a speaker's identity to the text content, rather than on either modality alone.

Source: arXiv: 2603.28378