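BEAP-Agent: Backtrackable Execution and Adaptive Planning for GUI Agents
1️⃣ One-sentence summary
This paper proposes BEAP-Agent, a framework that models GUI task execution as a depth-first search and adds backtrackable execution and adaptive planning, addressing the difficulty existing GUI agents have in recovering after a wrong step during task exploration and significantly improving robustness and success rate on complex, long-horizon tasks.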
GUI agents are designed to automate repetitive tasks and enhance productivity. However, existing GUI agents struggle to recover once they follow an incorrect exploration path, which often leads to task failure. In this work, we model GUI task execution as a depth-first search (DFS) process and propose BEAP-Agent, a DFS-based framework that supports long-range, multi-level state backtracking with dynamic task tracking and updating. The framework consists of three collaborative components (Planner, Executor, and Tracker) that together enable effective task exploration and execution. BEAP-Agent fills the gap in backtracking mechanisms for GUI agents, offering a systematic solution for long-horizon task exploration. We evaluated BEAP-Agent on the OSWorld benchmark, where it achieved an accuracy of 28.2%, validating the effectiveness of the proposed method.
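To make the idea of treating task execution as a DFS with backtracking concrete, here is a minimal Python sketch on a toy state graph. It is not the paper's implementation: the function `dfs_with_backtracking`, its callbacks `propose_actions` and `apply_action`, the `visited` set, and the `TOY_UI` graph are all hypothetical names introduced for illustration. They only loosely stand in for the roles of a planner (suggesting candidate actions), an executor (applying an action), and a tracker (remembering what has been explored).

```python
from typing import Callable, Hashable, Iterable, List, Optional, Set

State = Hashable
Action = str


def dfs_with_backtracking(
    start: State,
    is_goal: Callable[[State], bool],
    propose_actions: Callable[[State], Iterable[Action]],  # hypothetical planner stand-in
    apply_action: Callable[[State, Action], State],        # hypothetical executor stand-in
    max_depth: int = 10,
) -> Optional[List[Action]]:
    """Return a list of actions that reaches a goal state, or None if none is found."""
    visited: Set[State] = set()  # hypothetical tracker stand-in: states already explored
    path: List[Action] = []      # actions on the current exploration branch

    def explore(state: State, depth: int) -> bool:
        if is_goal(state):
            return True
        if depth >= max_depth or state in visited:
            return False
        visited.add(state)
        for action in propose_actions(state):
            path.append(action)
            if explore(apply_action(state, action), depth + 1):
                return True
            path.pop()  # backtrack: drop the failed step and try the next candidate
        return False

    return list(path) if explore(start, 0) else None


# Toy "GUI": each screen maps action names to the screen they lead to (assumed data).
TOY_UI = {
    "desktop": {"open_settings": "settings", "open_files": "files"},
    "settings": {"open_display": "display"},
    "display": {},
    "files": {"open_export": "export_dialog"},
    "export_dialog": {},
}

plan = dfs_with_backtracking(
    start="desktop",
    is_goal=lambda s: s == "export_dialog",
    propose_actions=lambda s: list(TOY_UI[s]),
    apply_action=lambda s, a: TOY_UI[s][a],
)
print(plan)  # ['open_files', 'open_export']
```

In this toy run the search first follows `open_settings` to a dead end, backtracks two levels to the desktop, and then succeeds through `open_files`. A real GUI agent cannot restore a screen simply by returning from a function call, so restoring the earlier state is the hard part that this sketch leaves out.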
Source: arXiv:2601.21352