arXiv submission date: 2026-04-07
📄 Abstract - BodhiPromptShield: Pre-Inference Prompt Mediation for Suppressing Privacy Propagation in LLM/VLM Agents

In LLM/VLM agents, prompt privacy risk propagates beyond a single model call because raw user content can flow into retrieval queries, memory writes, tool calls, and logs. Existing de-identification pipelines address document boundaries but not this cross-stage propagation. We propose BodhiPromptShield, a policy-aware framework that detects sensitive spans, routes them through typed placeholders, semantic abstraction, or secure symbolic mapping, and delays restoration until authorized boundaries. Relative to enterprise redaction, this adds explicit propagation-aware mediation and treats restoration timing as a security variable. Under controlled evaluation on the Controlled Prompt-Privacy Benchmark (CPPB), stage-wise propagation is suppressed from 10.7% to 7.1% across the retrieval, memory, and tool stages; PER reaches 9.3% with 0.94 AC and 0.92 TSR, outperforming generic de-identification. These are controlled systems results on CPPB rather than formal privacy guarantees or public-benchmark transfer claims. The project repository is available at this https URL.
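The detect/route/delayed-restore pattern described in the abstract can be sketched in a few lines. This is a minimal illustrative assumption, not the paper's implementation: the class name, the single typed-placeholder route, and the email-regex detector are all hypothetical stand-ins for the framework's policy-aware span detection and symbolic mapping.

```python
import re
from dataclasses import dataclass, field

@dataclass
class PromptMediator:
    """Hypothetical sketch of pre-inference prompt mediation:
    sensitive spans are swapped for typed placeholders before any
    downstream stage (retrieval, memory, tool call, log) sees them,
    and originals are restored only at an authorized boundary."""
    mapping: dict = field(default_factory=dict)  # placeholder -> original span
    counter: int = 0

    def mediate(self, prompt: str) -> str:
        # Toy detector: treat email addresses as the only sensitive type.
        def repl(match: re.Match) -> str:
            self.counter += 1
            placeholder = f"[EMAIL_{self.counter}]"
            self.mapping[placeholder] = match.group(0)
            return placeholder
        return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl, prompt)

    def restore(self, text: str, authorized: bool) -> str:
        # Delayed restoration: re-insert originals only when the output
        # boundary is authorized; otherwise placeholders propagate instead.
        if not authorized:
            return text
        for placeholder, original in self.mapping.items():
            text = text.replace(placeholder, original)
        return text

mediator = PromptMediator()
safe = mediator.mediate("Contact alice@example.com about the invoice.")
# Retrieval queries, memory writes, and logs operate on `safe` only.
restored = mediator.restore(safe, authorized=True)
```

The key design point this illustrates is that restoration timing is a separate decision from detection: intermediate stages never hold the raw span, so a leak at those stages exposes only the placeholder.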

Top-level tags: llm agents systems
Detailed tags: privacy protection prompt mediation de-identification multi-stage propagation security framework

BodhiPromptShield: Pre-Inference Prompt Mediation for Suppressing Privacy Propagation in LLM/VLM Agents


1️⃣ One-sentence summary

This paper proposes a framework named BodhiPromptShield that, before an LLM or VLM agent executes a task, detects sensitive information, substitutes it, and delays its restoration, thereby preventing user privacy from propagating across processing stages (such as retrieval, memory, and tool calls) and strengthening the system's privacy protection.

Source: arXiv:2604.05793