arXiv submission date: 2026-06-24
📄 Abstract - SidConArena: An Environment Evaluating Agents in Open-Ended, Positive-Sum Bargaining Game

Evaluating LLM agents requires dynamic environments that go beyond static reasoning and zero-sum games. Real-world economic interaction is often open-ended and mixed-motive: agents must negotiate, create positive-sum surplus, compete for scarce assets, and plan under delayed returns. We introduce SidConArena, a new benchmark framework for evaluating LLM agents in open-ended, positive-sum bargaining. SidConArena formalizes a multi-player economy as a finite-horizon partially observable stochastic game with three coupled phases: natural-language negotiation with binding trades, deterministic converter-based production, and sealed-bid auctions for long-term assets. The framework combines structured observations, phase-aware agent dispatching, a neural-symbolic action interface, and asynchronous execution, enabling free-form interaction while preserving rule-grounded evaluation. Across homogeneous and heterogeneous tournaments, stronger frontier models achieve higher economic outcomes, yet agents still misvalue resources, bargain passively, and remain limited in long-horizon investment planning.
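The three coupled phases named in the abstract can be illustrated with a small toy round. This is only a sketch under stated assumptions, not the SidConArena implementation: the `Player` class, the function names, the resource names, the first-price payment rule, and the tie-breaking by player order are all hypothetical choices made for illustration. The abstract specifies only that trades are binding, production is deterministic and converter-based, and auctions are sealed-bid.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Player:
    # Hypothetical player state: a resource inventory plus owned converters.
    name: str
    resources: Counter = field(default_factory=Counter)
    converters: list = field(default_factory=list)  # each is (inputs, outputs)


def can_pay(player, cost):
    """True if the player holds at least `cost` of every listed resource."""
    return all(player.resources[r] >= n for r, n in cost.items())


def apply_trade(a, b, a_gives, b_gives):
    """Binding trade: both transfers happen together, or neither does."""
    if not (can_pay(a, a_gives) and can_pay(b, b_gives)):
        return False
    a.resources.subtract(a_gives)
    b.resources.update(a_gives)
    b.resources.subtract(b_gives)
    a.resources.update(b_gives)
    return True


def run_production(player):
    """Deterministic production: run each affordable converter once."""
    for inputs, outputs in player.converters:
        if can_pay(player, inputs):
            player.resources.subtract(inputs)
            player.resources.update(outputs)


def sealed_bid_auction(players, bids, asset, currency="gold"):
    """Assumed first-price rule: the highest affordable bid wins and is paid."""
    affordable = [
        (bids.get(p.name, 0), p)
        for p in players
        if 0 < bids.get(p.name, 0) <= p.resources[currency]
    ]
    if not affordable:
        return None
    # Ties go to the earliest player in the list (an arbitrary assumption).
    price, winner = max(affordable, key=lambda pair: pair[0])
    winner.resources[currency] -= price
    winner.converters.append(asset)
    return winner.name


# One toy round: neither player can produce until they trade.
alice = Player("alice", Counter(ore=2), [({"ore": 1, "wood": 1}, {"gold": 3})])
bob = Player("bob", Counter(wood=2), [({"ore": 1, "wood": 1}, {"gold": 2})])

traded = apply_trade(alice, bob, {"ore": 1}, {"wood": 1})  # negotiation outcome
for p in (alice, bob):
    run_production(p)                                      # production phase
new_asset = ({"gold": 1}, {"ore": 2})                      # long-term asset
winner = sealed_bid_auction([alice, bob], {"alice": 2, "bob": 5}, new_asset)
```

In this toy round the trade is what creates surplus: before it, neither converter can run, and after it both can. Bob's bid of 5 is discarded because he holds only 2 gold, so Alice wins the converter with her bid of 2.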

Top-level tags: llm agents benchmark
Detailed tags: llm agents bargaining game open-ended evaluation mixed-motive multi-agent economy

SidConArena: An Environment Evaluating Agents in Open-Ended, Positive-Sum Bargaining Game


1️⃣ One-Sentence Summary

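This paper proposes SidConArena, a new evaluation framework that simulates a multi-player economic game combining negotiation, production, and investment auctions to assess how well AI agents make decisions in open-ended settings that mix cooperation and competition. It finds that although the most advanced current models perform better, they still misjudge resource values, negotiate passively, and fall short in long-term planning.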

Source: arXiv: 2606.27397