From Passive Generation to Investigation: A Proactive Scientific Peer Review Agent
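1️⃣ One-sentence summary
This paper proposes ProReviewer, an AI review agent that, like a human reviewer, proactively follows up on suspicious points in a paper and collects evidence step by step, producing peer reviews that are better grounded and more in-depth. Experiments show that even with a relatively small model, its review quality is significantly better than that of current mainstream methods.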
Large language models (LLMs) have shown promise in automating scientific peer review. However, existing approaches often struggle to generate in-depth reviews supported by concrete evidence. We argue that a key limitation is the lack of flexibility to proactively investigate suspicious parts of a paper based on accumulated evidence, as human reviewers do. In this paper, we explore how to enable an LLM-based review agent to perform such proactive investigation. We find that this can be naturally formulated as a Markov Decision Process (MDP), and propose ProReviewer, a scientific peer review agent that proactively reviews a paper guided by a maintained, structured review log. The structured review log serves as a workspace for the agent to track evidence and intermediate findings collected during review. Experiments show that ProReviewer with an 8B backbone, trained by supervised fine-tuning and optimized by reinforcement learning, achieves the highest average score across five quality dimensions, outperforming prompt-based methods that use much larger frontier LLMs by up to 39% and the strongest fine-tuned baseline by 16% in relative terms. It also attains the highest win rates against baselines in human evaluation.
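To make the MDP framing concrete, the following is a minimal toy sketch, not ProReviewer's actual implementation. It assumes a state made of the paper plus a review log, two hypothetical actions (`inspect` and `finish`), and a hand-written rule standing in for the LLM policy. All names here (`ReviewLog`, `State`, `policy`, `step`, `run_review`) are illustrative and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewLog:
    """Hypothetical structured workspace kept by the agent during review."""
    evidence: list = field(default_factory=list)       # (section, text) pairs collected so far
    open_concerns: list = field(default_factory=list)  # sections still to be inspected


@dataclass
class State:
    paper: dict          # section name -> section text
    log: ReviewLog
    done: bool = False


def policy(state):
    """Stand-in for the LLM policy: inspect the oldest open concern, else stop."""
    if state.log.open_concerns:
        return ("inspect", state.log.open_concerns[0])
    return ("finish", None)


def step(state, action):
    """Transition function: apply one action and update the review log."""
    kind, section = action
    if kind == "finish":
        state.done = True
        return state
    text = state.paper[section]
    state.log.open_concerns.remove(section)
    state.log.evidence.append((section, text))
    seen = {name for name, _ in state.log.evidence}
    # Toy rule (an assumption): a section that names another section
    # raises a new concern about that other section.
    for other in state.paper:
        if other in text and other not in seen and other not in state.log.open_concerns:
            state.log.open_concerns.append(other)
    return state


def run_review(paper, start, max_steps=10):
    """Roll out the policy from a starting section until it finishes."""
    state = State(paper=paper, log=ReviewLog(open_concerns=[start]))
    for _ in range(max_steps):
        if state.done:
            break
        state = step(state, policy(state))
    return state.log


paper = {
    "abstract": "Results in experiments show gains.",
    "experiments": "See appendix for details.",
    "appendix": "Hyperparameters.",
    "related": "Prior work.",
}
log = run_review(paper, start="abstract")
print([name for name, _ in log.evidence])
```

In this sketch the log decides what gets read next, so the "related" section is never inspected because no collected evidence points to it. That is the contrast with a single passive pass over the whole paper.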
Source: arXiv: 2606.13349