arXiv submission date: 2026-02-04
📄 Abstract - ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control

Expert-level scientific reasoning remains challenging for large language models, particularly on benchmarks such as Humanity's Last Exam (HLE), where rigid tool pipelines, brittle multi-agent coordination, and inefficient test-time scaling often limit performance. We introduce ReThinker, a confidence-aware agentic framework that orchestrates retrieval, tool use, and multi-agent reasoning through a stage-wise Solver-Critic-Selector architecture. Rather than following a fixed pipeline, ReThinker dynamically allocates computation based on model confidence, enabling adaptive tool invocation, guided multi-dimensional reflection, and robust confidence-weighted selection. To support scalable training without human annotation, we further propose a reverse data synthesis pipeline and an adaptive trajectory recycling strategy that transform successful reasoning traces into high-quality supervision. Experiments on HLE, GAIA, and XBench demonstrate that ReThinker consistently outperforms state-of-the-art foundation models with tools and existing deep research systems, achieving state-of-the-art results on expert-level reasoning tasks.
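The confidence-driven behavior described in the abstract — allocate more computation (reflection rounds) while confidence is low, then pick an answer by confidence-weighted voting — can be sketched roughly as below. This is a minimal illustration under assumed interfaces: `solve`, `reflect`, the threshold value, and the function names are hypothetical, not the paper's actual API.

```python
from collections import defaultdict

def confidence_weighted_select(candidates):
    """Pick the answer with the highest total confidence across candidates.

    `candidates` is a list of (answer, confidence) pairs, e.g. from several
    solver runs; confidences are assumed to lie in [0, 1].
    """
    scores = defaultdict(float)
    for answer, conf in candidates:
        scores[answer] += conf
    return max(scores, key=scores.get)

def solve_with_rethinking(solve, reflect, question, threshold=0.7, max_rounds=3):
    """Keep invoking critic-guided reflection while confidence stays below
    the threshold, then select among all candidates by weighted vote."""
    answer, conf = solve(question)          # initial solver pass
    candidates = [(answer, conf)]
    for _ in range(max_rounds):
        if conf >= threshold:               # confident enough: stop early
            break
        answer, conf = reflect(question, candidates)  # guided reflection retry
        candidates.append((answer, conf))
    return confidence_weighted_select(candidates)
```

The key design point mirrored here is that test-time compute is adaptive: a confident first answer exits immediately, while uncertain ones trigger further reflection, and the final selection aggregates all attempts rather than trusting the last one.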

Top-level tags: llm agents, model evaluation
Detailed tags: scientific reasoning, agentic framework, confidence control, multi-agent reasoning, reasoning benchmarks

ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control


1️⃣ One-sentence summary

This paper introduces ReThinker, an agentic reasoning framework that enables large language models to reason like domain experts. Its core innovation is dynamically assessing the confidence of its own answers in order to flexibly invoke tools and perform reflective self-correction, achieving state-of-the-art results on several challenging expert-level scientific reasoning benchmarks.

Source: arXiv:2602.04496