📄 Abstract - PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits

Understanding human behavior traits is central to applications in human-computer interaction, computational social science, and personalized AI systems. Such understanding often requires integrating multiple modalities to capture nuanced patterns and relationships. However, existing resources rarely provide datasets that combine behavioral descriptors with complementary modalities such as facial attributes and biographical information. To address this gap, we present PersonaX, a curated collection of multimodal datasets designed to enable comprehensive analysis of public traits across modalities. PersonaX consists of (1) CelebPersona, featuring 9444 public figures from diverse occupations, and (2) AthlePersona, covering 4181 professional athletes across 7 major sports leagues. Each dataset includes behavioral trait assessments inferred by three high-performing large language models, alongside facial imagery and structured biographical features. We analyze PersonaX at two complementary levels. First, we abstract high-level trait scores from text descriptions and apply five statistical independence tests to examine their relationships with other modalities. Second, we introduce a novel causal representation learning (CRL) framework tailored to multimodal and multi-measurement data, providing theoretical identifiability guarantees. Experiments on both synthetic and real-world data demonstrate the effectiveness of our approach. By unifying structured and unstructured analysis, PersonaX establishes a foundation for studying LLM-inferred behavioral traits in conjunction with visual and biographical attributes, advancing multimodal trait analysis and causal reasoning.

Top-level tags: llm multi-modal data
Detailed tags: behavioral traits, multimodal datasets, causal representation learning, trait analysis, biographical features

📄 Paper Summary

PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits


1️⃣ One-Sentence Summary

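This paper presents PersonaX, a collection of multimodal datasets that combines behavioral traits inferred by large language models with facial images and biographical information, providing a foundation for research on cross-modal behavior analysis and causal reasoning.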

