arXiv submission date: 2026-01-29
📄 Abstract - NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents

In this paper, we present NEMO, a system that translates Natural-language descriptions of decision problems into formal Executable Mathematical Optimization implementations, operating collaboratively with users or autonomously. Existing approaches typically rely on specialized large language models (LLMs) or bespoke, task-specific agents. Such methods are often brittle and complex, and they frequently generate syntactically invalid or non-executable code. NEMO instead centers on remote interaction with autonomous coding agents (ACAs), treated as a first-class abstraction analogous to API-based interaction with LLMs. This design enables the construction of higher-level systems around ACAs that structure, consolidate, and iteratively refine task specifications. Because ACAs execute within sandboxed environments, code produced by NEMO is executable by construction, allowing automated validation and repair. Building on this, we introduce novel coordination patterns with and across ACAs, including asymmetric validation loops between independently generated optimizer and simulator implementations (serving as a high-level validation mechanism), external memory for experience reuse, and robustness enhancements via minimum Bayes risk (MBR) decoding and self-consistency. We evaluate NEMO on nine established optimization benchmarks. As depicted in Figure 1, it achieves state-of-the-art performance on the majority of tasks, with substantial margins on several datasets, demonstrating the power of execution-aware agentic architectures for automated optimization modeling.
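
To make the MBR and self-consistency idea concrete, the sketch below picks one answer from several independent runs by choosing the candidate that disagrees least with the others. It is only an illustration under assumptions that do not come from the paper: the candidates are taken to be numeric objective values, the loss is a 0/1 disagreement within a relative tolerance, and the names `mbr_select` and `rel_tol` are hypothetical. NEMO's actual procedure may differ.

```python
from typing import Sequence


def mbr_select(candidates: Sequence[float], rel_tol: float = 1e-6) -> float:
    """Return the candidate with the lowest average disagreement with the others.

    Hypothetical helper: the 0/1 loss and the tolerance are illustrative choices.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    def agree(a: float, b: float) -> bool:
        # Two objective values "agree" if they match within a relative tolerance.
        return abs(a - b) <= rel_tol * max(1.0, abs(a), abs(b))

    def risk(i: int) -> float:
        # Average 0/1 loss of candidate i measured against every other candidate.
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        misses = sum(0.0 if agree(candidates[i], c) else 1.0 for c in others)
        return misses / len(others)

    # min() returns the first index with the lowest risk, so ties go to the
    # earliest candidate.
    best = min(range(len(candidates)), key=risk)
    return candidates[best]


# Objective values from five hypothetical independent runs.
runs = [42.0, 42.0, 41.5, 42.0000001, 37.0]
print(mbr_select(runs))
```

With a 0/1 loss this selection behaves much like a majority vote, which is why MBR decoding and self-consistency are often discussed together; a graded loss (for example, absolute difference in objective value) would weight near misses differently.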

Top-level tags: llm agents systems
Detailed tags: autonomous coding agents, optimization modeling, executable code generation, agent coordination, validation loops

NEMO: Execution-Aware Optimization Modeling via Autonomous Coding Agents


1️⃣ One-sentence summary

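This paper presents NEMO, a system that automatically translates decision problems described in natural language into executable mathematical optimization code. By working with autonomous coding agents, it ensures that the generated code not only runs but can also be validated and refined through novel coordination mechanisms, achieving leading performance on multiple standard benchmarks.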

Source: arXiv: 2601.21372