arXiv submission date: 2026-03-16
📄 Abstract - POLCA: Stochastic Generative Optimization with LLM

Optimizing complex systems, ranging from LLM prompts to multi-turn agents, traditionally requires labor-intensive manual iteration. We formalize this challenge as a stochastic generative optimization problem where a generative language model acts as the optimizer, guided by numerical rewards and text feedback to discover the best system. We introduce Prioritized Optimization with Local Contextual Aggregation (POLCA), a scalable framework designed to handle stochasticity in optimization -- such as noisy feedback, sampled minibatches, and stochastic system behaviors -- while effectively managing the unconstrained expansion of the solution space. POLCA maintains a priority queue to manage the exploration-exploitation tradeoff, systematically tracking candidate solutions and their evaluation histories. To enhance efficiency, we integrate an $\varepsilon$-Net mechanism to maintain parameter diversity and an LLM Summarizer to perform meta-learning across historical trials. We theoretically prove that POLCA converges to near-optimal candidate solutions under stochasticity. We evaluate our framework on diverse benchmarks, including $\tau$-bench, HotpotQA (agent optimization), VeriBench (code translation), and KernelBench (CUDA kernel generation). Experimental results demonstrate that POLCA achieves robust, sample- and time-efficient performance, consistently outperforming state-of-the-art algorithms on both deterministic and stochastic problems. The codebase for this work is publicly available at this https URL.
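The abstract names three mechanisms: a priority queue over candidate solutions, an $\varepsilon$-Net diversity filter, and optimization under noisy reward feedback. As a rough illustration only (not the paper's implementation), the sketch below replaces the LLM optimizer with a random perturbation of a scalar parameter and the system's reward with a noisy 1-D objective; all function names here are hypothetical:

```python
import heapq
import random

random.seed(0)

def noisy_reward(x: float) -> float:
    """Stochastic feedback: true objective -(x-3)^2 plus observation noise."""
    return -(x - 3.0) ** 2 + random.gauss(0.0, 0.1)

def propose(x: float) -> float:
    """Stand-in for the LLM optimizer: perturb an existing candidate."""
    return x + random.gauss(0.0, 0.5)

def eps_net_accept(pool, x: float, eps: float = 0.2) -> bool:
    """Toy eps-net diversity filter: reject candidates within eps of any pooled one."""
    return all(abs(x - y) >= eps for _, y in pool)

def optimize(iterations: int = 200, pool_size: int = 8) -> float:
    # Priority queue of (-reward, candidate); heapq is a min-heap,
    # so the best-scoring candidate sits at pool[0].
    pool = [(-noisy_reward(0.0), 0.0)]
    for _ in range(iterations):
        # Exploit the best candidate most of the time; explore others sometimes.
        base = pool[0][1] if random.random() < 0.7 else random.choice(pool)[1]
        cand = propose(base)
        r = noisy_reward(cand)
        if eps_net_accept(pool, cand):
            heapq.heappush(pool, (-r, cand))
        pool = heapq.nsmallest(pool_size, pool)  # prune to the top candidates
        heapq.heapify(pool)
    return pool[0][1]  # best candidate found

best = optimize()
```

In the actual framework, the proposal step is a generative LLM conditioned on evaluation histories and the Summarizer distills past trials into context; both are omitted from this toy loop.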

Top-level tags: llm, model training, agents
Detailed tags: generative optimization, stochastic optimization, meta-learning, priority queue, exploration-exploitation

POLCA: Stochastic Generative Optimization with LLM


1️⃣ One-sentence summary

This paper proposes POLCA, a new optimization framework that uses a large language model as the optimizer. By combining reward feedback with historical experience, it automatically optimizes complex systems (such as prompts or multi-turn agents) efficiently and robustly, outperforming existing methods across a range of tasks.

Source: arXiv: 2603.14769