arXiv submission date: 2026-04-01
📄 Abstract - LangMARL: Natural Language Multi-Agent Reinforcement Learning

Large language model (LLM) agents struggle to autonomously evolve coordination strategies in dynamic environments, largely because coarse global outcomes obscure the causal signals needed for local policy refinement. We identify this bottleneck as a multi-agent credit assignment problem, which has long been studied in classical multi-agent reinforcement learning (MARL) but remains underaddressed in LLM-based systems. Building on this observation, we propose LangMARL, a framework that brings credit assignment and policy gradient evolution from cooperative MARL into the language space. LangMARL introduces agent-level language credit assignment, pioneers gradient evolution in language space for policy improvement, and summarizes task-relevant causal relations from replayed trajectories to provide dense feedback and improve convergence under sparse rewards. Extensive experiments across diverse cooperative multi-agent tasks demonstrate improved sample efficiency, interpretability, and strong generalization.
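The credit-assignment bottleneck the abstract describes has a classic formulation in cooperative MARL: difference (counterfactual) rewards, which decompose a coarse global outcome into per-agent contributions. The sketch below illustrates that classic idea only; the function names and the toy task are hypothetical and are not LangMARL's actual method or API.

```python
# Illustrative sketch of the classic MARL credit-assignment idea that
# LangMARL builds on: decompose a coarse team reward into per-agent
# credit via counterfactual "difference" rewards. The toy task and all
# names here are assumptions for illustration, not from the paper.

def team_reward(actions):
    # Toy cooperative task: the team is rewarded for each distinct
    # useful action; duplicated work earns nothing extra.
    return len(set(a for a in actions if a != "idle"))

def difference_rewards(actions, default="idle"):
    """Credit each agent by how much the team reward drops when its
    action is replaced with a no-op baseline."""
    base = team_reward(actions)
    credits = []
    for i in range(len(actions)):
        counterfactual = actions[:i] + [default] + actions[i + 1:]
        credits.append(base - team_reward(counterfactual))
    return credits

actions = ["scout", "scout", "build"]
print(team_reward(actions))         # global signal: 2
print(difference_rewards(actions))  # per-agent credit: [0, 0, 1]
```

Note how the global reward (2) alone cannot tell the two scouts that their work is redundant, while the per-agent credits do; this is the "coarse global outcomes obscure causal signals" problem the abstract identifies.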

Top tags: llm agents multi-agents
Detailed tags: multi-agent reinforcement learning, credit assignment, policy gradient, language agents, cooperative tasks

LangMARL: Natural Language Multi-Agent Reinforcement Learning


1️⃣ One-Sentence Summary

This paper proposes a new framework, LangMARL, which brings the ideas of credit assignment and policy-gradient evolution from classical multi-agent reinforcement learning into language models. It addresses the difficulty LLM agents face in autonomously evolving cooperative strategies in dynamic environments, improving learning efficiency, interpretability, and generalization.

Source: arXiv 2604.00722