arXiv submission date: 2026-06-25
📄 Abstract - SocialPersona: Benchmarking Personalized Profiling and Response with Multimodal Social-Media Context

Personalized language-model assistants are often evaluated through a memory lens: can a model recall preferences users have explicitly stated in dialogue? More comprehensive personalization demands a harder capability -- inferring what users care about from the multimodal traces they naturally leave behind. We introduce SocialPersona, a benchmark for evaluating whether multimodal large language models (MLLMs) can recover revealed preferences from longitudinal social-media timelines and use them in dialogue. Built from longitudinal timelines of 171 everyday, non-promotional social-media users, SocialPersona contains text, images, timestamps, and 2,597 human-verified preference tags across seven interest domains, separating stable interests from recent interests. It supports two tasks: constructing structured user profiles from multimodal context and generating responses aligned with inferred profiles. Experiments with proprietary and open-weight MLLMs show that models can identify broad interest domains, yet their performance drops on fine-grained and recent interests and degrades further when inferred profiles must be used to personalize dialogue. Together with evidence that text and images provide complementary preference signals, these results indicate that robust cross-modal, long-horizon user modeling remains a key challenge, and that SocialPersona can help measure and advance progress toward assistants that infer and act on revealed preferences.
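As a reading aid, the data described in the abstract (posts with text, images, and timestamps; preference tags grouped by interest domain and split into stable versus recent) could be modeled roughly as below. This is only a minimal sketch: every class name, field name, and the placeholder domain strings are hypothetical, and the abstract does not specify the benchmark's actual schema or the names of its seven domains.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Post:
    """One timeline entry (hypothetical schema)."""
    timestamp: datetime
    text: str
    image_paths: List[str] = field(default_factory=list)


@dataclass
class PreferenceTag:
    """One preference tag (hypothetical schema)."""
    domain: str   # broad interest domain; actual domain names are not given in the abstract
    label: str    # fine-grained interest within that domain
    recent: bool  # True for a recent interest, False for a stable one


@dataclass
class UserProfile:
    """Structured profile a model might be asked to construct (hypothetical schema)."""
    user_id: str
    tags: List[PreferenceTag] = field(default_factory=list)

    def stable(self) -> List[PreferenceTag]:
        # Tags marked as long-standing interests.
        return [t for t in self.tags if not t.recent]

    def recent(self) -> List[PreferenceTag]:
        # Tags marked as recently emerged interests.
        return [t for t in self.tags if t.recent]

    def domains(self) -> List[str]:
        # Distinct broad domains covered by the profile, sorted for stable output.
        return sorted({t.domain for t in self.tags})
```

Keeping the stable/recent flag on each tag, rather than in two separate lists, makes it straightforward to score broad-domain, fine-grained, and recent-interest recovery separately from the same profile object.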

Top-level tags: multi-modal llm benchmark
Detailed tags: personalization user profiling preference inference social media dialogue

SocialPersona: Benchmarking Personalized Profiling and Response with Multimodal Social-Media Context


1️⃣ One-sentence summary

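This paper introduces SocialPersona, a benchmark that uses the text, images, and timestamps posted over long periods by 171 everyday social-media users to evaluate whether multimodal large language models can infer users' real interests and preferences from these naturally occurring traces and use them to generate personalized dialogue responses; experiments show that models can identify broad interests but still struggle with fine-grained interests, recent interests, and applying inferred preferences in personalized dialogue.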

Source: arXiv: 2606.26654