arXiv submission date: 2026-03-25
📄 Abstract - Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Recent advances in the Model Context Protocol (MCP) have enabled large language models (LLMs) to invoke external tools with unprecedented ease, creating a new class of powerful, tool-augmented agents. Unfortunately, this capability also introduces an under-explored attack surface: the malicious manipulation of tool responses. Existing techniques for indirect prompt injection that target MCP suffer from high deployment costs, weak semantic coherence, or heavy white-box requirements. Furthermore, they are often easily detected by recently proposed defenses. In this paper, we propose Tree-structured Injection for Payloads (TIP), a novel black-box attack that generates natural payloads to reliably seize control of MCP-enabled agents even under defense. Technically, we cast payload generation as a tree-structured search problem and guide the search with an attacker LLM operating under our proposed coarse-to-fine optimization framework. To stabilize learning and avoid local optima, we introduce a path-aware feedback mechanism that surfaces only high-quality historical trajectories to the attacker model. The framework is further hardened against defensive transformations by explicitly conditioning the search on observable defense signals and dynamically reallocating the exploration budget. Extensive experiments on four mainstream LLMs show that TIP attains over 95% attack success in undefended settings while requiring an order of magnitude fewer queries than prior adaptive attacks. Against four representative defense approaches, TIP preserves more than 50% effectiveness and significantly outperforms state-of-the-art attacks. By implementing the attack on real-world MCP systems, our results expose an invisible but practical threat vector in MCP deployments. We also discuss potential mitigation approaches to address this critical security gap.
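The abstract describes TIP's core loop only at a high level: a tree-structured search over candidate payloads, expanded by an attacker LLM and guided by feedback that retains only high-quality historical trajectories. The paper does not give pseudocode here, so the following is a minimal, hypothetical sketch of that loop. The `mutate` and `score` functions are placeholder stand-ins (in the real attack they would be the attacker LLM proposing payload refinements and the observed response of the victim MCP agent, respectively); `tip_style_search`, `budget`, and `top_paths` are illustrative names, not the paper's API.

```python
import heapq
import itertools

def mutate(payload, k=3):
    """Placeholder for the attacker LLM proposing k refined children."""
    return [payload + f"|v{i}" for i in range(k)]

def score(payload):
    """Placeholder for querying the victim agent and scoring the outcome.
    Toy heuristic: more refinement steps -> higher score, capped at 1.0."""
    return min(1.0, payload.count("|") * 0.2)

def tip_style_search(seed, budget=20, top_paths=2):
    """Best-first tree search sketch: expand the most promising payload
    first, and keep only the highest-scoring historical trajectories
    (a rough analogue of the 'path-aware feedback' idea)."""
    counter = itertools.count()  # tie-breaker so the heap never compares paths
    frontier = [(-score(seed), next(counter), seed, [seed])]
    best_paths = []              # retained high-quality trajectories
    queries = 0
    while frontier and queries < budget:
        neg_score, _, payload, path = heapq.heappop(frontier)
        best_paths.append((-neg_score, path))
        best_paths = sorted(best_paths, reverse=True)[:top_paths]
        for child in mutate(payload):
            queries += 1  # each scored child costs one victim query
            heapq.heappush(
                frontier,
                (-score(child), next(counter), child, path + [child]))
    return best_paths

best = tip_style_search("seed")
```

The query counter mirrors the black-box constraint the abstract emphasizes: every child evaluation spends one query against the victim, so the search budget, not wall-clock time, is the limiting resource.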

Top-level tags: llm agents systems
Detailed tags: prompt injection · model security · adversarial attack · tool-augmented agents · black-box attack

Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search


1️⃣ One-sentence summary

This paper proposes TIP, a novel black-box attack that generates natural, stealthy malicious instructions to effectively hijack LLM agents capable of invoking external tools, maintaining a high success rate even when defenses are in place, thereby exposing a practical and serious security vulnerability in such AI systems.

From arXiv: 2603.24203