AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems

📄 Abstract - AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems

As multi-agent AI systems are increasingly deployed in real-world settings, from automated customer support to DevOps remediation, failures become harder to diagnose due to cascading effects, hidden dependencies, and long execution traces. We present AgentTrace, a lightweight causal tracing framework for post-hoc failure diagnosis in deployed multi-agent workflows. AgentTrace reconstructs causal graphs from execution logs, traces backward from error manifestations, and ranks candidate root causes using interpretable structural and positional signals, without requiring LLM inference at debugging time. Across a diverse benchmark of multi-agent failure scenarios designed to reflect common deployment patterns, AgentTrace localizes root causes with high accuracy and sub-second latency, significantly outperforming both heuristic and LLM-based baselines. Our results suggest that causal tracing provides a practical foundation for improving the reliability and trustworthiness of agentic systems in the wild.
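The three steps named in the abstract (graph reconstruction, backward tracing, ranking) can be illustrated with a minimal sketch. It is not the authors' implementation: the log schema (`id`, `agent`, `parents`, `error`), the two signals (fan-out as a structural signal, earliness in the trace as a positional signal), and the equal weights are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical log schema (an assumption, not taken from the paper): each event
# records its id, the producing agent, the events it consumed, and an error flag.
LOG = [
    {"id": "e1", "agent": "planner",   "parents": [],     "error": False},
    {"id": "e2", "agent": "retriever", "parents": ["e1"], "error": False},
    {"id": "e3", "agent": "executor",  "parents": ["e2"], "error": False},
    {"id": "e4", "agent": "reporter",  "parents": ["e3"], "error": True},
    {"id": "e5", "agent": "logger",    "parents": ["e1"], "error": False},
]


def build_graph(log):
    """Return parent and child adjacency maps keyed by event id."""
    parents = {e["id"]: list(e["parents"]) for e in log}
    children = defaultdict(list)
    for e in log:
        for p in e["parents"]:
            children[p].append(e["id"])
    return parents, children


def ancestors(parents, start):
    """Breadth-first walk backward from `start`; return {ancestor_id: hops}."""
    dist = {}
    frontier = [(start, 0)]
    while frontier:
        node, d = frontier.pop(0)
        for p in parents.get(node, []):
            if p not in dist:
                dist[p] = d + 1
                frontier.append((p, d + 1))
    return dist


def rank_root_causes(log, w_fanout=0.5, w_position=0.5):
    """Rank ancestors of the first error event by two assumed signals."""
    parents, children = build_graph(log)
    order = {e["id"]: i for i, e in enumerate(log)}
    error_id = next(e["id"] for e in log if e["error"])
    candidates = ancestors(parents, error_id)
    max_fanout = max((len(children[c]) for c in candidates), default=1) or 1
    n = max(len(log) - 1, 1)
    scores = {}
    for c in candidates:
        structural = len(children[c]) / max_fanout  # more dependents, wider impact
        positional = 1 - order[c] / n               # earlier in the trace scores higher
        scores[c] = w_fanout * structural + w_position * positional
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


for event_id, score in rank_root_causes(LOG):
    print(event_id, round(score, 3))
```

Only events that the failing step actually depends on become candidates, so `e5` is never scored. The ranking uses graph traversal and arithmetic alone, which is the kind of LLM-free, interpretable computation the abstract describes.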

1️⃣ One-Sentence Summary

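This paper proposes AgentTrace, a lightweight framework that automatically builds causal graphs from system execution logs to locate the root causes of multi-agent system failures quickly and accurately, without calling a large language model at debugging time, thereby improving the reliability and maintainability of such systems.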
