Verifiable Semantics for Agent-to-Agent Communication
1️⃣ One-Sentence Summary
This paper proposes a certification protocol based on the stimulus-meaning model: by testing agents' interpretations of shared observable events, it verifies that they assign consistent semantics to terms, substantially reducing communication errors caused by semantic divergence between agents.
Multiagent AI systems require consistent communication, but we lack methods to verify that agents share the same understanding of the terms used. Natural language is interpretable but vulnerable to semantic drift, while learned protocols are efficient but opaque. We propose a certification protocol based on the stimulus-meaning model, where agents are tested on shared observable events and terms are certified if empirical disagreement falls below a statistical threshold. In this protocol, agents restricting their reasoning to certified terms ("core-guarded reasoning") achieve provably bounded disagreement. We also outline mechanisms for detecting drift (recertification) and recovering shared vocabulary (renegotiation). In simulations with varying degrees of semantic divergence, core-guarding reduces disagreement by 72-96%. In a validation with fine-tuned language models, disagreement is reduced by 51%. Our framework provides a first step towards verifiable agent-to-agent communication.
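A minimal sketch of the certification test described in the abstract, under two assumptions the paper's abstract does not pin down: agents are modeled as label predicates over stimuli, and the statistical threshold is a one-sided Hoeffding upper confidence bound on the true disagreement rate.

```python
import math
import random

def certify_term(agent_a, agent_b, stimuli, epsilon=0.05, delta=0.05):
    """Certify a term if, with confidence 1 - delta, the true disagreement
    rate between the two agents on shared stimuli is below epsilon.

    Assumption: a one-sided Hoeffding bound implements the "statistical
    threshold" mentioned in the abstract; the paper may use another test.
    """
    n = len(stimuli)
    disagreements = sum(agent_a(s) != agent_b(s) for s in stimuli)
    p_hat = disagreements / n
    # Hoeffding: true rate <= p_hat + sqrt(ln(1/delta) / (2n)) w.p. 1 - delta
    upper_bound = p_hat + math.sqrt(math.log(1 / delta) / (2 * n))
    return upper_bound <= epsilon, p_hat

# Toy usage: two predicates for a term like "red", with slight
# semantic divergence in where they draw the boundary.
agent_a = lambda x: x > 0.50
agent_b = lambda x: x > 0.52
stimuli = [random.random() for _ in range(2000)]
certified, rate = certify_term(agent_a, agent_b, stimuli)
print(f"certified={certified}, empirical disagreement={rate:.3f}")
```

Core-guarded reasoning would then restrict downstream inference to terms that pass this test, which is what yields the bounded-disagreement guarantee the abstract claims.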
Source: arXiv:2602.16424