arXiv submission date: 2026-05-27
📄 Abstract - MRMMIA: Membership Inference Attacks on Memory in Chat Agents

Membership inference attacks (MIAs) test whether a target data record belongs to a system's private data, and have become a standard tool to measure privacy leakage in machine learning systems. Prior work has primarily focused on training corpora or retrieval databases. However, MIAs against agent memory have received less attention, even though such memory can contain sensitive user-agent interactions, retrieved facts, and user preferences. Therefore, in this work, we focus on chat agent memory MIAs, where an adversary infers whether a candidate memory unit belongs to the chat agent's memory store. We propose Multi-Recall Memory MIA (MRMMIA), a unified attack that utilizes multiple recall probes to the agent to extract the membership signal across black-box, gray-box, and white-box settings. Our experiments demonstrate that MRMMIA consistently outperforms baselines. Our results expose the privacy risk in agents and provide an initial evaluation framework for membership leakage in chat-agent memory systems.

Top-level tags: llm agents
Detailed tags: membership inference attack, privacy leakage, agent memory, black-box attack, evaluation framework

MRMMIA: Membership Inference Attacks on Memory in Chat Agents


1️⃣ One-sentence summary

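This paper proposes MRMMIA, a unified attack that sends multiple memory-recall queries to a chat agent to probe whether it has stored a specific piece of user data, thereby exposing the privacy-leakage risk in chat-agent memory systems. Experiments show that the method outperforms existing techniques across the attack settings considered.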
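To make the idea of probing memory with several recall queries concrete, here is a minimal toy sketch of a black-box setting. It is not the paper's algorithm: the abstract does not specify how probes are built or scored, so `ToyAgent`, `make_probes`, `membership_score`, the word-overlap retrieval rule, and the partial-reveal probing strategy are all hypothetical choices made for illustration only.

```python
from statistics import mean


def tokens(text):
    """Lowercased whitespace tokens as a set (deliberately crude)."""
    return set(text.lower().split())


class ToyAgent:
    """Hypothetical stand-in for a chat agent backed by a memory store."""

    def __init__(self, memory):
        self.memory = list(memory)

    def chat(self, prompt):
        # Assumed behavior: echo the stored memory unit that shares the most
        # words with the prompt, or refuse when the overlap is too small.
        prompt_tokens = tokens(prompt)
        best = max(self.memory, key=lambda m: len(tokens(m) & prompt_tokens), default="")
        if len(tokens(best) & prompt_tokens) < 2:
            return "I do not recall anything about that."
        return best


def make_probes(candidate, n_probes=3):
    """Build probes that each reveal a different part of the candidate.

    Returns (shown, hidden) pairs; the hidden words are what the attacker
    hopes the agent will reproduce if the candidate is in memory.
    """
    words = candidate.split()
    probes = []
    for i in range(n_probes):
        shown = [w for j, w in enumerate(words) if j % n_probes != i]
        hidden = [w for j, w in enumerate(words) if j % n_probes == i]
        probes.append((" ".join(shown), " ".join(hidden)))
    return probes


def membership_score(agent, candidate, n_probes=3):
    """Average, over probes, the fraction of hidden words the agent reveals."""
    scores = []
    for shown, hidden in make_probes(candidate, n_probes):
        reply = agent.chat("What do you remember about: " + shown)
        hidden_tokens = tokens(hidden)
        recovered = len(hidden_tokens & tokens(reply)) / max(len(hidden_tokens), 1)
        scores.append(recovered)
    return mean(scores)


agent = ToyAgent([
    "user is allergic to peanuts and prefers window seats",
    "user lives in Lisbon with two cats",
])
member = "user is allergic to peanuts and prefers window seats"
non_member = "user has a severe allergy to shellfish and avoids sushi"
print(membership_score(agent, member), membership_score(agent, non_member))
```

In this toy setup the stored record scores higher than the unseen one, and averaging over several probes keeps a single coincidental word match from dominating the decision. A real attack would threshold or calibrate the score against a reference distribution, and real agents paraphrase rather than echo, so a semantic similarity measure would be needed instead of word overlap.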

Source: arXiv: 2605.27825