Beyond Static Personas: Situational Personality Steering for Large Language Models
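1️⃣ One-Sentence Summary
This paper proposes IRIS, a training-free method that lets large language models dynamically adjust their "personality," or response style, to the situation at hand, enabling more natural and adaptable human-computer interaction.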
Personalized Large Language Models (LLMs) facilitate more natural, human-like interactions in human-centric applications. However, existing personalization methods are constrained by limited controllability and high resource demands. Furthermore, their reliance on static personality modeling restricts adaptability across varying situations. To address these limitations, we first demonstrate the existence of situation-dependency and consistent situation-behavior patterns within LLM personalities through a multi-perspective analysis of persona neurons. Building on these insights, we propose IRIS, a training-free, neuron-based Identify-Retrieve-Steer framework for advanced situational personality steering. Our approach comprises situational persona neuron identification, situation-aware neuron retrieval, and similarity-weighted steering. We empirically validate our framework on PersonalityBench and our newly introduced SPBench, a comprehensive situational personality benchmark. Experimental results show that our method surpasses the best-performing baselines, demonstrating IRIS's generalization and robustness to complex, unseen situations and to different model architectures.
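The abstract names the retrieve and steer stages but does not give their details, so the following is only a toy sketch of what "situation-aware retrieval" followed by "similarity-weighted steering" could look like. Everything in it is an assumption made for illustration: the function names, the use of cosine similarity over situation embeddings, the top-k cutoff, the clipping of negative similarities, and the representation of persona neurons as index-to-offset maps are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, bank, k=2):
    """Return the k bank entries most similar to the query situation.

    `bank` is a hypothetical list of (situation_embedding, neuron_offsets)
    pairs, where neuron_offsets maps a neuron index to an activation offset.
    """
    scored = [(cosine(query, emb), neurons) for emb, neurons in bank]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def steer(activations, retrieved, strength=1.0):
    """Add each retrieved offset to the activations, weighted by similarity.

    Negative similarities are clipped to zero (an assumption), and the
    remaining similarities are normalized so the weights sum to one.
    """
    total = sum(max(sim, 0.0) for sim, _ in retrieved)
    steered = list(activations)
    if total == 0:
        return steered  # nothing similar enough: leave activations unchanged
    for sim, neurons in retrieved:
        weight = max(sim, 0.0) / total
        for idx, delta in neurons.items():
            steered[idx] += strength * weight * delta
    return steered

# Toy bank: three stored situations, each tied to one hypothetical neuron.
bank = [
    ([1.0, 0.0], {0: 1.0}),
    ([0.0, 1.0], {1: 2.0}),
    ([-1.0, 0.0], {2: 5.0}),
]
query = [1.0, 1.0]  # an unseen situation lying between the first two
steered = steer([0.0, 0.0, 0.0], retrieve(query, bank, k=2))
```

In this toy setup the unseen situation is equally similar to the first two stored situations, so their offsets are blended with equal weight and the third, dissimilar situation is never retrieved.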
Source: arXiv: 2604.13846