arXiv submission date: 2026-03-17
📄 Abstract - ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning

The dominant paradigm for improving mathematical reasoning in language models relies on Reinforcement Learning with verifiable rewards. Yet existing methods treat each problem instance in isolation, without leveraging the reusable strategies that emerge and accumulate during training. To this end, we introduce ARISE (Agent Reasoning via Intrinsic Skill Evolution), a hierarchical reinforcement learning framework in which a shared policy operates both to manage skills at the high level and to generate responses at the low level (denoted as a Skills Manager and a Worker, respectively). The Manager maintains a tiered skill library through a dedicated skill-generation rollout that performs structured summarization of successful solution traces (after execution), while employing a policy-driven selection mechanism to retrieve relevant skills that condition future rollouts (before execution). A hierarchical reward design guides the co-evolution of reasoning ability and library quality. Experiments on two base models and seven benchmarks spanning both competition mathematics and Omni-MATH show that ARISE consistently outperforms GRPO-family algorithms and memory-augmented baselines, with particularly notable gains on out-of-distribution tasks. Ablation studies confirm that each component contributes to the observed improvements and that library quality and reasoning performance improve in tandem throughout training. Code is available at this https URL.
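The select-before / summarize-after loop described in the abstract can be sketched in a few lines. This is an illustrative mock, not the paper's implementation: `SkillLibrary`, `arise_step`, and the keyword-overlap scoring are hypothetical stand-ins (the paper uses a learned, policy-driven selection mechanism and a structured summarization rollout).

```python
class SkillLibrary:
    """Hypothetical tiered skill library, as described in the abstract."""

    def __init__(self):
        self.tiers = {"general": [], "specific": []}

    def add(self, skill, tier="specific"):
        # Deduplicate before storing a newly summarized skill.
        if skill not in self.tiers[tier]:
            self.tiers[tier].append(skill)

    def select(self, problem, k=2):
        # Stand-in for the paper's policy-driven selection: naive
        # keyword overlap replaces the learned retrieval policy.
        scored = []
        for tier in self.tiers.values():
            for skill in tier:
                overlap = len(set(problem.split()) & set(skill.split()))
                scored.append((overlap, skill))
        scored.sort(key=lambda x: -x[0])
        return [s for _, s in scored[:k]]


def arise_step(library, problem, solve, summarize):
    """One rollout: retrieve skills (before execution), attempt the
    problem, and on success summarize the trace into a new skill
    (after execution)."""
    skills = library.select(problem)
    success, trace = solve(problem, skills)
    if success:
        library.add(summarize(trace))
    return success
```

The key structural point the sketch captures is that the same loop both consumes the library (conditioning rollouts on retrieved skills) and grows it (summarizing successful traces), which is what lets reasoning ability and library quality co-evolve.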

Top-level tags: reinforcement learning, agents, model training
Detailed tags: hierarchical rl, skill library, mathematical reasoning, policy learning, co-evolution

ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning


1️⃣ One-sentence summary

This paper proposes ARISE, a hierarchical reinforcement learning framework that continually improves the mathematical reasoning of large language models by having the agent automatically summarize and reuse successful problem-solving strategies (skills) during training, with especially notable gains on unseen, hard problems.

Source: arXiv:2603.16060