HiconAgent: History Context-aware Policy Optimization for GUI Agents
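1️⃣ One-sentence summary
This paper proposes HiconAgent, a GUI agent trained with a history context-aware optimization method that lets it use past interactions effectively to raise task success rates while substantially cutting computational cost, improving both performance and efficiency.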
Graphical User Interface (GUI) agents require effective use of historical context to perform sequential navigation tasks. While incorporating past actions and observations can improve decision making, naive use of full history leads to excessive computational overhead and distraction from irrelevant information. To address this, we introduce HiconAgent, a GUI agent trained with History Context-aware Policy Optimization (HCPO) for efficient and effective utilization of historical information. HCPO optimizes history usage in both sampling and policy updates through two complementary components: (1) Dynamic Context Sampling (DCS) presents the agent with variable length histories during sampling, enabling adaptive use of the most relevant context; (2) Anchor-guided History Compression (AHC) refines the policy update phase with a dual branch strategy where the compressed branch removes history observations while keeping history actions as information flow anchors. The compressed and uncompressed branches are coupled through a history-enhanced alignment loss to enforce consistent history usage while maintaining efficiency. Experiments on mainstream GUI navigation benchmarks demonstrate strong performance. Despite being smaller, HiconAgent-3B outperforms GUI-R1-7B by +8.46 percent grounding accuracy and +11.32 percent step success rate on GUI-Odyssey, while achieving comparable results on AndroidControl and AITW with up to 2.47x computational speedup and 60 percent FLOPs reduction.
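The two history-handling ideas in the abstract can be illustrated with a small data-level sketch. This is not the authors' implementation: the `Step` type, the function names, the choice to keep the most recent steps, and the uniform draw of the history length are all assumptions made for illustration, since the abstract does not specify them. The alignment loss that couples the two branches is not shown.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One past interaction step (hypothetical structure)."""
    observation: Optional[str]  # e.g. a screenshot reference; None once removed
    action: str                 # e.g. "click(120, 340)"


def sample_history(history: List[Step], max_len: int, rng: random.Random) -> List[Step]:
    """DCS-style sketch: expose a variable-length history.

    Assumption: the length k is drawn uniformly from 0..max_len and the
    most recent k steps are kept. The abstract does not state the scheme.
    """
    k = rng.randint(0, min(max_len, len(history)))
    # Slice from an explicit start index so that k == 0 yields an empty list.
    return history[len(history) - k:]


def compress_history(history: List[Step]) -> List[Step]:
    """AHC-style sketch: drop past observations, keep past actions as anchors."""
    return [Step(observation=None, action=s.action) for s in history]


if __name__ == "__main__":
    rng = random.Random(0)
    full = [Step(f"screen_{i}.png", f"action_{i}") for i in range(5)]
    sampled = sample_history(full, max_len=3, rng=rng)   # uncompressed branch input
    compressed = compress_history(sampled)                # compressed branch input
    print(len(sampled), [s.action for s in compressed])
```

In this sketch the compressed branch sees the same action sequence as the uncompressed one but no past observations, which is where the efficiency gain described in the abstract would come from.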