arXiv submission date: 2026-01-09
📄 Abstract - Can We Predict Before Executing Machine Learning Agents?

Autonomous machine learning agents have revolutionized scientific discovery, yet they remain constrained by a Generate-Execute-Feedback paradigm. Previous approaches suffer from a severe Execution Bottleneck, as hypothesis evaluation relies strictly on expensive physical execution. To bypass these physical constraints, we internalize execution priors to substitute costly runtime checks with instantaneous predictive reasoning, drawing inspiration from World Models. In this work, we formalize the task of Data-centric Solution Preference and construct a comprehensive corpus of 18,438 pairwise comparisons. We demonstrate that LLMs exhibit significant predictive capabilities when primed with a Verified Data Analysis Report, achieving 61.5% accuracy and robust confidence calibration. Finally, we instantiate this framework in FOREAGENT, an agent that employs a Predict-then-Verify loop, achieving a 6x acceleration in convergence while surpassing execution-based baselines by +6%. Our code and dataset will be publicly available soon at this https URL.
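The Predict-then-Verify loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the toy `prior_score` heuristic standing in for the LLM preference predictor, and the confidence threshold are all assumptions. The idea it demonstrates is the one the abstract names: use a cheap predictive comparison to skip most expensive executions, and only pay for a real execution when the predictor confidently favors a challenger.

```python
import random


def predict_preference(candidate_a, candidate_b):
    """Hypothetical stand-in for the LLM predictor: returns the candidate
    judged more promising plus a confidence in [0, 1]. Here a toy
    heuristic over a 'prior_score' field replaces the model call."""
    better = max(candidate_a, candidate_b, key=lambda c: c["prior_score"])
    confidence = min(1.0, abs(candidate_a["prior_score"] - candidate_b["prior_score"]))
    return better, confidence


def execute_and_score(candidate):
    """Stand-in for the expensive physical execution (a real training/eval run)."""
    return candidate["prior_score"] + random.uniform(-0.05, 0.05)


def predict_then_verify(candidates, threshold=0.2):
    """Keep a running incumbent solution; consult the cheap predictor for
    each challenger and execute only when it confidently prefers the
    challenger. Returns the surviving incumbent and the execution count."""
    incumbent = candidates[0]
    incumbent_score = execute_and_score(incumbent)
    executions = 1
    for challenger in candidates[1:]:
        preferred, conf = predict_preference(incumbent, challenger)
        # Skip the costly run unless the predictor confidently favors the challenger.
        if preferred is challenger and conf >= threshold:
            score = execute_and_score(challenger)
            executions += 1
            if score > incumbent_score:
                incumbent, incumbent_score = challenger, score
    return incumbent, executions
```

With four candidates whose prior scores are 0.3, 0.35, 0.9, and 0.5, only two executions occur: the near-tie (0.35) falls below the confidence threshold and the clearly worse challenger (0.5) loses the predicted comparison, so both are skipped without any run.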

Top-level tags: llm agents, model evaluation
Detailed tags: autonomous agents, predictive reasoning, world models, data-centric preference benchmark

Can We Predict Before Executing Machine Learning Agents?


1️⃣ One-Sentence Summary

This paper proposes a method that lets an AI agent predict how good a solution will be before actually executing it, letting it skip most of the time-consuming execution steps. This yields a 6x speedup in convergence while outperforming traditional execution-based approaches.

Source: arXiv:2601.05930