Agent2Agent Threats in Safety-Critical LLM Assistants: A Human-Centric Taxonomy
1️⃣ One-Sentence Summary
This paper targets the security of communication between large language model assistants in safety-critical settings such as automobiles. It proposes a new threat modeling framework, AgentHeLLM, which strictly separates "what is being protected" from "how it is attacked" and introduces a human-centric harm taxonomy, in order to systematically discover and analyze multi-stage attack paths that could lead to severe consequences.
2️⃣ Abstract
The integration of Large Language Model (LLM)-based conversational agents into vehicles creates novel security challenges at the intersection of agentic AI, automotive safety, and inter-agent communication. As these intelligent assistants coordinate with external services via protocols such as Google's Agent-to-Agent (A2A), they establish attack surfaces where manipulations can propagate through natural language payloads, potentially causing severe consequences ranging from driver distraction to unauthorized vehicle control. Existing AI security frameworks, while foundational, lack the rigorous "separation of concerns" that is standard in safety-critical systems engineering: they commingle what is being protected (assets) with how it is attacked (attack paths). This paper addresses that methodological gap by proposing AgentHeLLM (Agent Hazard Exploration for LLM Assistants), a threat modeling framework that formally separates asset identification from attack path analysis. We introduce a human-centric asset taxonomy derived from harm-oriented "victim modeling" and inspired by the Universal Declaration of Human Rights, together with a formal graph-based model that distinguishes poison paths (malicious data propagation) from trigger paths (activation actions). We demonstrate the framework's practical applicability through an open-source attack path suggestion tool, the AgentHeLLM Attack Path Generator, which automates multi-stage threat discovery using a bi-level search strategy.
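To make the poison-path/trigger-path distinction and the bi-level search more concrete, below is a minimal Python sketch of one plausible reading of the abstract's graph model. All identifiers here (`build_graph`, `find_attack_paths`, the example agents and assets) are illustrative assumptions, not the actual API of the AgentHeLLM Attack Path Generator.

```python
# Minimal sketch: a directed graph with typed edges ("poison" vs. "trigger")
# and a bi-level search for multi-stage attack paths. Hypothetical names,
# not taken from the paper's open-source tool.
from collections import defaultdict
from itertools import product


def build_graph(edges):
    """Directed, edge-typed graph: node -> list of (neighbor, edge_type)."""
    graph = defaultdict(list)
    for src, dst, kind in edges:
        graph[src].append((dst, kind))
    return graph


def paths_of_kind(graph, start, goal, kind, path=None):
    """Depth-first enumeration of simple paths from start to goal
    using only edges of the given type ('poison' or 'trigger')."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for nxt, k in graph.get(start, []):
        if k == kind and nxt not in path:
            yield from paths_of_kind(graph, nxt, goal, kind, path)


def find_attack_paths(graph, entry_points, pivots, assets):
    """Bi-level search (one plausible interpretation): the outer level
    iterates over (pivot, asset) pairs, i.e. where poisoned data could
    affect a human-centric asset; the inner level enumerates a poison
    path (malicious data reaching the pivot) followed by a trigger path
    (an activation action from the pivot to the asset)."""
    for pivot, asset in product(pivots, assets):               # outer level
        for entry in entry_points:                             # inner level
            for poison in paths_of_kind(graph, entry, pivot, "poison"):
                for trigger in paths_of_kind(graph, pivot, asset, "trigger"):
                    yield poison, trigger


# Hypothetical in-vehicle scenario: an external A2A peer poisons data that
# the in-car assistant later acts on, reaching a safety-relevant asset.
EDGES = [
    ("external_agent", "a2a_message",   "poison"),
    ("a2a_message",    "llm_assistant", "poison"),
    ("llm_assistant",  "vehicle_api",   "trigger"),
    ("vehicle_api",    "driver_safety", "trigger"),
]

if __name__ == "__main__":
    g = build_graph(EDGES)
    for poison, trigger in find_attack_paths(
        g,
        entry_points=["external_agent"],
        pivots=["llm_assistant"],
        assets=["driver_safety"],
    ):
        print("poison path :", " -> ".join(poison))
        print("trigger path:", " -> ".join(trigger))
```

The point of the sketch is the separation of concerns the paper argues for: the asset (`driver_safety`) is fixed independently of any route to it, while poison and trigger edges are searched separately and only joined at a pivot node, so a multi-stage attack emerges as a composition rather than a single hand-crafted scenario.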
Source: arXiv: 2602.05877