Abstract - SidConArena: An Environment Evaluating Agents in Open-Ended, Positive-Sum Bargaining Game
Evaluating LLM agents requires dynamic environments that go beyond static reasoning and zero-sum games. Real-world economic interaction is often open-ended and mixed-motive: agents must negotiate, create positive-sum surplus, compete for scarce assets, and plan under delayed returns. We introduce SidConArena, a new benchmark framework for evaluating LLM agents in open-ended, positive-sum bargaining. SidConArena formalizes a multi-player economy as a finite-horizon partially observable stochastic game with three coupled phases: natural-language negotiation with binding trades, deterministic converter-based production, and sealed-bid auctions for long-term assets. The framework combines structured observations, phase-aware agent dispatching, a neural-symbolic action interface, and asynchronous execution, enabling free-form interaction while preserving rule-grounded evaluation. Across homogeneous and heterogeneous tournaments, stronger frontier models achieve higher economic outcomes, yet agents still misvalue resources, bargain passively, and remain limited in long-horizon investment planning.
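To make the production and auction phases concrete, here is a minimal Python sketch of a deterministic converter and a sealed-bid auction. The abstract does not specify the rules, so everything here is an assumption for illustration: the function names `run_converter` and `sealed_bid_auction`, the resource names, the first-price payment rule, and the alphabetical tie-break are all hypothetical and may differ from what SidConArena actually implements.

```python
from collections import Counter


def run_converter(inventory, inputs, outputs):
    """Deterministic production step (hypothetical rule).

    Consumes `inputs` from `inventory` and adds `outputs`. If any input
    is missing, the inventory is returned unchanged.
    """
    inv = Counter(inventory)
    if any(inv[res] < qty for res, qty in inputs.items()):
        return dict(inventory)
    inv.subtract(inputs)
    inv.update(outputs)
    # Drop resources whose count fell to zero.
    return {res: qty for res, qty in inv.items() if qty > 0}


def sealed_bid_auction(bids):
    """Sealed-bid auction, assumed first-price: the highest bidder wins
    and pays their own bid. Ties go to the alphabetically first name
    (an assumption made only to keep the sketch deterministic).
    """
    if not bids:
        return None, 0
    winner = min(bids, key=lambda player: (-bids[player], player))
    return winner, bids[winner]
```

Under these assumptions, a player holding two units of wood could run a wood-to-plank converter and then bid the result in the asset auction; the negotiation phase, which is free-form natural language, is deliberately left out of the sketch.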
1️⃣ One-Sentence Summary
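This paper introduces SidConArena, a new evaluation framework that simulates a multi-player economic game combining negotiation, production, and investment auctions in order to assess AI agents' overall decision-making in open-ended settings that mix cooperation and competition. It finds that the most advanced current models perform better, yet still misjudge resource values, negotiate passively, and fall short in long-term planning.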