arXiv submission date: 2026-05-25
📄 Abstract - StreamProfileBench: A Benchmark for Fine-Grained User Profile Inference in Real-World Streaming Scenarios

Large Language Models (LLMs) have reshaped user profiling, yet current evaluations mainly focus on static data snapshots. This paradigm overlooks the reality of personalized systems, where User-Generated Content (UGC) arrives continuously and fine-grained profiles evolve rapidly. To bridge this gap, we introduce StreamProfileBench, a large-scale benchmark for fine-grained streaming user profiling. We formalize streaming user profiling as a continuous state maintenance task and curate a highly authentic dataset comprising over 120,000 UGC posts from 7,000+ real users across five diverse platforms. By leveraging the temporal correlation of user interests, we further propose a novel, annotation-free evaluation framework. Extensive experiments across 14 leading LLMs reveal that continuous profile updating remains an open challenge. Models exhibit a systematic conservative bias, over-retaining past interests while failing to recognize interest decay. Ablation experiments further validate the practical utility and necessity of the streaming paradigm.
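
To make the "continuous state maintenance" framing and the conservative bias more concrete, here is a minimal sketch. It is not the paper's formalization or its evaluation metric: the set-of-tags profile representation, the helper names `maintain_profile`, `over_retention`, and `keep_everything`, and the toy posts are all assumptions made for illustration.

```python
def maintain_profile(posts, update, initial=frozenset()):
    """Fold an update rule over a post stream.

    Returns the profile state after each post, so the profile is
    maintained continuously rather than recomputed from a static snapshot.
    """
    state = frozenset(initial)
    history = []
    for post in posts:
        state = frozenset(update(state, post))
        history.append(state)
    return history


def over_retention(predicted, previous, current):
    """Share of decayed interests that the predicted profile still contains.

    "Decayed" means present in the previous reference profile but absent
    from the current one. This is a hypothetical score, not the benchmark's.
    """
    decayed = previous - current
    if not decayed:
        return 0.0
    return len(predicted & decayed) / len(decayed)


def keep_everything(state, post):
    # Toy "conservative" updater: it only ever adds tags and never drops any.
    return state | set(post["tags"])


# Hypothetical stream: the user posts about cycling once, then only baking.
posts = [{"tags": ["cycling"]}, {"tags": ["baking"]}, {"tags": ["baking"]}]
history = maintain_profile(posts, keep_everything)
score = over_retention(
    history[-1],
    previous=frozenset({"cycling"}),
    current=frozenset({"baking"}),
)
```

In this toy run the updater never forgets, so the only decayed interest (`cycling`) is still in the final profile and the score is at its maximum. An updater that dropped stale tags would score lower.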

Top-level tags: llm benchmark data
Detailed tags: user profiling streaming temporal reasoning interest decay evaluation

StreamProfileBench: A Benchmark for Fine-Grained User Profile Inference in Real-World Streaming Scenarios


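1️⃣ One-sentence summary

The paper introduces StreamProfileBench, a large-scale benchmark for evaluating how well large language models infer and update user profiles in real time from continuous, dynamic streams of user-generated content. It shows that current models exhibit a conservative bias: they tend to retain old interests while overlooking interest decay.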

Source: arXiv: 2605.25758