D-MEM: Dopamine-Gated Agentic Memory via Reward Prediction Error Routing
1️⃣ One-Sentence Summary
This paper proposes a new agentic memory architecture inspired by the brain's dopamine mechanism. It evaluates how "surprising" and "useful" incoming information is to decide whether to cache it quickly or integrate it deeply into the agent's knowledge, preserving long-term learning ability while substantially reducing compute and resource costs.
Autonomous LLM agents require structured long-term memory, yet current "append-and-evolve" systems like A-MEM face O(N^2) write-latency and excessive token costs. We introduce D-MEM (Dopamine-Gated Agentic Memory), a biologically inspired architecture that decouples short-term interaction from cognitive restructuring via a Fast/Slow routing system based on Reward Prediction Error (RPE). A lightweight Critic Router evaluates stimuli for Surprise and Utility. Routine, low-RPE inputs are bypassed or cached in an O(1) fast-access buffer. Conversely, high-RPE inputs, such as factual contradictions or preference shifts, trigger a "dopamine" signal, activating the O(N) memory evolution pipeline to reshape the agent's knowledge graph. To evaluate performance under realistic conditions, we introduce the LoCoMo-Noise benchmark, which injects controlled conversational noise into long-term sessions. Evaluations demonstrate that D-MEM reduces token consumption by over 80%, eliminates O(N^2) bottlenecks, and outperforms baselines in multi-hop reasoning and adversarial resilience. By selectively gating cognitive restructuring, D-MEM provides a scalable, cost-efficient foundation for lifelong agentic memory.
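The abstract's Fast/Slow routing can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the class names, RPE weighting, and threshold are assumptions chosen to show the gating idea (low-RPE inputs go to a bounded O(1) buffer; high-RPE inputs trigger the slower memory-evolution path).

```python
from collections import deque

class CriticRouter:
    """Illustrative sketch of D-MEM's Fast/Slow gating (names/weights assumed)."""

    def __init__(self, threshold=0.5, buffer_size=128):
        self.threshold = threshold
        self.fast_buffer = deque(maxlen=buffer_size)  # O(1) append with eviction
        self.knowledge_graph = []  # stand-in for the evolved memory store

    def rpe(self, surprise, utility):
        # Reward-prediction-error proxy: high when an input is both
        # unexpected and useful. Equal weights are an assumption.
        return 0.5 * surprise + 0.5 * utility

    def route(self, item, surprise, utility):
        if self.rpe(surprise, utility) >= self.threshold:
            # "Dopamine" signal: hand off to the O(N) evolution pipeline
            # (here, simply appended to a list as a placeholder).
            self.knowledge_graph.append(item)
            return "slow"
        # Routine, low-RPE input: cache cheaply in the fast buffer.
        self.fast_buffer.append(item)
        return "fast"

router = CriticRouter()
print(router.route("user now prefers tea over coffee", surprise=0.9, utility=0.8))  # slow
print(router.route("routine greeting", surprise=0.1, utility=0.2))  # fast
```

In this sketch, only the slow path mutates the long-lived knowledge store, which is the source of the claimed token and latency savings: most interactions never touch the expensive evolution pipeline.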
Source: arXiv:2603.14597