Mem-$\pi$: Adaptive Memory through Learning When and What to Generate
1️⃣ One-sentence summary
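This paper proposes Mem-$\pi$, a memory framework for AI agents that does not rely on retrieval from a conventional external database. Instead, it trains a separate model to generate task-specific guidance on demand, which helps agents (for example, browser-operating or text-based embodied agents) perform better on complex tasks, with over 30% relative improvement on web navigation tasks.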
We present Mem-$\pi$, a framework for adaptive memory in large language model (LLM) agents, where useful guidance is generated on demand rather than retrieved from external memory stores. Existing memory-augmented agents typically rely on similarity-based retrieval from episodic memory banks or skill libraries, returning static entries that often misalign with the current context. In contrast, Mem-$\pi$ uses a dedicated language or vision-language model with its own parameters, separate from the downstream agent, to generate context-specific guidance for complex tasks. Conditioned on the current agent context, the model jointly decides when to produce guidance and what guidance to produce. We train it with a decision-content decoupled reinforcement learning (RL) objective, enabling it to abstain when generation would not help and otherwise produce concise, useful guidance. Across diverse agentic benchmarks spanning web navigation, terminal-based tool use, and text-based embodied interaction, Mem-$\pi$ consistently outperforms retrieval-based and prior RL-optimized memory baselines, achieving over 30% relative improvement on web navigation tasks.
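The "when and what" interface described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `ABSTAIN` sentinel, `MemoryDecision`, `parse_memory_output`, `build_agent_prompt`, and the rule-based `toy_memory_model` are all hypothetical names introduced here, and the RL training objective is not modeled at all. The sketch only shows a separate memory model that either abstains or emits guidance, which is then added to the downstream agent's context.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical sentinel the memory model emits when it decides not to generate.
ABSTAIN = "<abstain>"


@dataclass
class MemoryDecision:
    generate: bool            # the "when" decision
    guidance: Optional[str]   # the "what" content, None when abstaining


def parse_memory_output(raw: str) -> MemoryDecision:
    """Split the memory model's raw output into a decision and its content."""
    text = raw.strip()
    if not text or text == ABSTAIN:
        return MemoryDecision(generate=False, guidance=None)
    return MemoryDecision(generate=True, guidance=text)


def build_agent_prompt(context: str, memory_model: Callable[[str], str]) -> str:
    """Return the agent context, extended with guidance only when it is generated."""
    decision = parse_memory_output(memory_model(context))
    if not decision.generate:
        return context  # abstained: the agent sees its context unchanged
    return f"{context}\n\nGuidance: {decision.guidance}"


def toy_memory_model(context: str) -> str:
    """Stand-in for a trained model (assumption): a keyword rule, not a learned policy."""
    if "checkout" in context:
        return "Open the cart before looking for the payment form."
    return ABSTAIN


if __name__ == "__main__":
    print(build_agent_prompt("Task: finish checkout", toy_memory_model))
    print(build_agent_prompt("Task: say hello", toy_memory_model))
```

In this sketch the memory model is passed in as a plain callable, mirroring the abstract's point that it has its own parameters and is separate from the downstream agent; a real system would replace `toy_memory_model` with a call to a trained language or vision-language model.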
Source: arXiv: 2605.21463