Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts
1️⃣ One-sentence summary
Through four attempts to have large language models autonomously generate machine-learning research papers, this paper finds that three of the four failed, identifies six recurring failure modes in autonomous AI research, and proposes design principles for building more reliable AI-scientist systems.
We report a case study of four end-to-end attempts to autonomously generate ML research papers using a pipeline of six LLM agents mapped to stages of the scientific workflow. Of these four, three attempts failed during implementation or evaluation. One completed the pipeline and was accepted to Agents4Science 2025, an experimental inaugural venue that required AI systems as first authors, passing both human and multi-AI review. From these attempts, we document six recurring failure modes: bias toward training-data defaults, implementation drift under execution pressure, memory and context degradation across long-horizon tasks, overexcitement that declares success despite obvious failures, insufficient domain intelligence, and weak scientific taste in experimental design. We conclude by discussing four design principles for more robust AI-scientist systems and implications for autonomous scientific discovery, and we release all prompts, artifacts, and outputs at this https URL
Source: arXiv:2601.03315