arXiv submission date: 2026-01-22
📄 Abstract - GameTalk: Training LLMs for Strategic Conversation

Strategic decision-making in multi-agent settings is a key challenge for large language models (LLMs), particularly when coordination and negotiation must unfold over extended conversations. While recent work has explored the use of LLMs in isolated decision tasks, little attention has been given to optimizing long-term objectives through dialogue. We introduce **GameTalk**, a framework for training LLMs to make strategic decisions via multi-turn interactions. Unlike prior work that focuses on single-turn objectives or static action prediction, we train LLMs to optimize a global objective across full conversations. We achieve this by adapting fine-tuning methods like GRPO, DPO, and STaR to incorporate reward signals that depend on the entire interaction. We evaluate this approach on a suite of increasingly complex games, designed to stress different aspects of reasoning, coordination, and opponent modeling. Our results show that GameTalk significantly outperforms untrained models, especially under reward shaping, with DPO consistently yielding the strongest gains. These findings position conversational fine-tuning as a promising path for LLMs to reason, negotiate, and act in interactive environments.
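The abstract's key move is that the reward signal depends on the entire interaction, not on any single turn. The sketch below illustrates one plausible way this could feed a DPO-style pipeline: roll out full multi-turn games, score each whole conversation, then pair high- against low-reward transcripts as chosen/rejected preference data. All names here (`play_game`, `Rollout`, the random stand-in reward) are hypothetical illustrations, not the paper's implementation.

```python
# Hedged sketch: conversation-level rewards -> DPO preference pairs.
# Assumes a `policy` callable mapping a prompt string to a reply string.
import random
from dataclasses import dataclass


@dataclass
class Rollout:
    transcript: list[str]  # all turns of one full conversation
    reward: float          # global objective, known only at game end


def play_game(policy, seed: int) -> Rollout:
    """Placeholder self-play: run a full multi-turn game, score the whole thing."""
    rng = random.Random(seed)
    turns = [policy(f"turn {t}") for t in range(4)]
    return Rollout(transcript=turns, reward=rng.random())  # stand-in reward


def conversation_dpo_pairs(policy, n_games: int = 8):
    """Pair rollouts by final reward: the higher-scoring full conversation
    becomes 'chosen', the lower one 'rejected', so credit is assigned to
    the entire interaction rather than to any single turn."""
    rollouts = [play_game(policy, seed=i) for i in range(n_games)]
    rollouts.sort(key=lambda r: r.reward, reverse=True)
    half = len(rollouts) // 2
    return [
        {"chosen": winner.transcript, "rejected": loser.transcript}
        for winner, loser in zip(rollouts[:half], rollouts[half:])
    ]


if __name__ == "__main__":
    dummy_policy = lambda prompt: f"reply to {prompt}"
    for pair in conversation_dpo_pairs(dummy_policy):
        print(pair["chosen"][0], "vs", pair["rejected"][0])
```

The same whole-conversation scoring could, under the abstract's framing, also serve as the trajectory return for GRPO or as the filtering criterion for STaR; only the preference-pair construction above is DPO-specific.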

Top tags: llm agents, model training
Detailed tags: strategic conversation, multi-agent dialogue, reinforcement learning, fine-tuning, coordination, opponent modeling

GameTalk: Training LLMs for Strategic Conversation


1️⃣ One-sentence summary

This paper introduces GameTalk, a framework that trains large language models to optimize long-term objectives over multi-turn dialogue, significantly improving their performance in complex interactive settings that demand strategic reasoning, coordination, and negotiation.

Source: arXiv:2601.16276