菜单

关于 🐙 GitHub
arXiv 提交日期: 2026-06-17
📄 Abstract - Formal Verification of Learned Multi-Agent Communication Policies via Decision Tree Distillation

Multi-agent reinforcement learning (MARL) enables agents to develop coordination strategies through emergent communication, but neural policies lack the formal safety guarantees required for safety-critical robotic deployment in drone swarms and autonomous vehicle fleets. We present the first end-to-end framework for safety verification of learned multi-agent communication policies through policy abstraction: neural policies are distilled into interpretable decision trees, then formally verified, with empirical validation confirming that verified safety properties transfer to original networks. Our four-stage pipeline consists of domain-specific feature extraction from agent observations, decision tree distillation achieving 97.9% +/- 1.2% fidelity to neural policies, automated translation to PRISM probabilistic model checker specifications with complete feature-to-state-variable correspondence, and compositional verification of Probabilistic Computation Tree Logic (PCTL) properties via pairwise decomposition with union-bound aggregation and empirical neighbor modeling. Evaluating Vector-Quantized Variational Information Bottleneck (VQ-VIB) policies for multi-drone coordination with 5-7 agents, we verify 18 temporal logic properties across safety, liveness, and cooperation, achieving 88.9% property satisfaction with all five safety thresholds satisfied (0.3% collision probability vs. 1% threshold). Monte Carlo validation of original neural policies confirms that verified safety properties transfer with <=0.6 percentage-point deviation (95% CI). Discrete VQ-VIB messages provide +11.6 to +13.6 percentage-point fidelity advantages over continuous methods, enabling 3-4x faster verification. Our framework provides empirically validated safety verification for distilled policy abstractions, serving as a practical bridge between deep MARL and formal safety workflows for multi-robot deployment.

顶级标签: reinforcement learning agents systems
详细标签: formal verification decision tree distillation multi-agent communication safety verification probabilistic model checking 或 搜索:

基于决策树蒸馏的多智能体通信策略形式化验证 / Formal Verification of Learned Multi-Agent Communication Policies via Decision Tree Distillation


1️⃣ 一句话总结

本文提出首个端到端框架,通过将多智能体强化学习中的神经网络通信策略蒸馏为可解释的决策树,再对其进行形式化验证,从而在保证安全性的前提下,将验证结果可靠地迁移回原始神经策略,最终实现了无人机集群等场景下低碰撞概率等安全属性的严格证明。

源自 arXiv: 2606.19632