From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

📄 Abstract

With the rapid emergence of personal AI agents based on Large Language Models (LLMs), running them on-device has become essential for privacy and responsiveness. To handle the inherently personal and context-dependent nature of real-world requests, such agents must ground their generation in device-resident personal context. Under tight memory budgets, however, the core bottleneck is deciding what to store so that retrieval remains aligned with the user. We propose EPIC (Efficient Preference-aligned Index Construction), which treats user preferences as a compact and stable form of personal context and integrates them throughout the RAG pipeline. EPIC selectively retains preference-relevant information from raw data and steers retrieval toward preference-aligned contexts. Across four benchmarks covering conversations, debates, explanations, and recommendations, EPIC reduces indexing memory by a factor of 2,404, improves preference-following accuracy by 20.17 percentage points, and achieves 33.33 times lower retrieval latency than the best-performing baseline. In our on-device experiment, EPIC maintains a memory footprint under 1 MB with a latency of 29.35 ms/query under streaming updates.


1️⃣ One-Sentence Summary

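This paper proposes EPIC, an efficient method that stores on the device (for example, a phone) only the information most relevant to the user's personal preferences and focuses retrieval on those preferences. With a very small memory footprint (under 1 MB), it substantially improves how accurately an AI assistant understands user intent and follows user preferences (a gain of about 20 percentage points), while cutting retrieval latency by a factor of more than 33.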
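To make the idea of "store only what reflects preferences, then retrieve against that small store" concrete, here is a minimal toy sketch. It is not the paper's implementation: the cue-phrase detector (`PREFERENCE_CUES`), the entry cap (`max_entries`), and the bag-of-words cosine scoring are all assumptions introduced purely for illustration, standing in for whatever preference extraction and retrieval EPIC actually uses.

```python
import math
import re
from collections import Counter

# Hypothetical cue phrases that mark a preference statement. This is an
# illustrative stand-in, not the detector used by EPIC.
PREFERENCE_CUES = ("i prefer", "i like", "i love", "i dislike", "i hate", "i avoid")


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_preference(utterance):
    """Return True if the utterance contains one of the cue phrases."""
    lowered = utterance.lower()
    return any(cue in lowered for cue in PREFERENCE_CUES)


def build_index(utterances, max_entries):
    """Keep only preference-bearing utterances, capped at max_entries.

    When the cap is exceeded, the most recent entries are kept
    (an assumed eviction policy for this sketch).
    """
    if max_entries <= 0:
        return []
    kept = [u for u in utterances if is_preference(u)]
    kept = kept[-max_entries:]
    # Each entry stores the raw text and its term counts.
    return [(u, Counter(tokenize(u))) for u in kept]


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(index, query, k=1):
    """Return the k stored preference texts most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(index, key=lambda entry: cosine(q, entry[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    log = [
        "The weather was rainy on Tuesday.",
        "I prefer vegetarian restaurants.",
        "My flight landed at 9 pm.",
        "I dislike loud music while working.",
    ]
    index = build_index(log, max_entries=8)
    print(len(index), "of", len(log), "utterances stored")
    print(retrieve(index, "Which restaurants should I book?"))
```

The point of the sketch is the filtering step: entries that carry no preference signal never enter the index, so both memory use and the retrieval search space shrink before any query arrives.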
