arXiv submission date: 2026-01-09
📄 Abstract - PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

We introduce Parallel Coordinated Reasoning (PaCoRe), a training-and-inference framework designed to overcome a central limitation of contemporary language models: their inability to scale test-time compute (TTC) far beyond sequential reasoning under a fixed context window. PaCoRe departs from the traditional sequential paradigm by driving TTC through massive parallel exploration coordinated via a message-passing architecture in multiple rounds. Each round launches many parallel reasoning trajectories, compacts their findings into context-bounded messages, and synthesizes these messages to guide the next round and ultimately produce the final answer. Trained end-to-end with large-scale, outcome-based reinforcement learning, the model masters the synthesis abilities required by PaCoRe and scales to multi-million-token effective TTC without exceeding context limits. The approach yields strong improvements across diverse domains, and notably pushes reasoning beyond frontier systems in mathematics: an 8B model reaches 94.5% on HMMT 2025, surpassing GPT-5's 93.2% by scaling effective TTC to roughly two million tokens. We open-source model checkpoints, training data, and the full inference pipeline to accelerate follow-up work.
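Below is a minimal sketch of the multi-round parallel-coordination loop the abstract describes: each round launches parallel trajectories, compacts them into bounded messages, and synthesizes those messages into guidance for the next round. All names here (`generate`, `compact`, `synthesize`, `num_rounds`, `num_trajectories`) are illustrative assumptions, not the paper's actual API, and the parallel trajectories are shown sequentially for clarity.

```python
from typing import Callable, List

def pacore_inference(
    problem: str,
    generate: Callable[[str], str],               # one reasoning trajectory from the model
    compact: Callable[[str], str],                # compress a trajectory into a bounded message
    synthesize: Callable[[str, List[str]], str],  # merge messages into guidance / final answer
    num_rounds: int = 3,
    num_trajectories: int = 8,
) -> str:
    """Sketch of rounds of parallel exploration coordinated via message passing."""
    guidance = ""  # context carried between rounds, kept within the context window
    for _ in range(num_rounds):
        # Condition each trajectory on the problem plus the guidance synthesized so far.
        prompt = problem if not guidance else f"{problem}\n\nPrior findings:\n{guidance}"
        trajectories = [generate(prompt) for _ in range(num_trajectories)]
        # Compact each trajectory's findings into a context-bounded message.
        messages = [compact(t) for t in trajectories]
        # Synthesize the messages to guide the next round (or yield the final answer).
        guidance = synthesize(problem, messages)
    return guidance
```

Under this reading, the effective test-time compute grows with `num_rounds × num_trajectories × tokens-per-trajectory`, while the model only ever sees the bounded guidance, which is how the framework stays within the context window.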

Top-level tags: llm model training agents
Detailed tags: test-time compute parallel reasoning reinforcement learning reasoning scaling message-passing

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning


1️⃣ One-sentence summary

This paper proposes a new framework called PaCoRe, in which a language model performs multiple rounds of parallel exploration and coordinated information synthesis. This breaks the bottleneck of traditional models, which cannot substantially scale up test-time compute for deep reasoning under a fixed context window, and allows a small model to surpass top-tier large models on mathematical reasoning tasks.

Source: arXiv:2601.05593