arXiv submission date: 2026-01-08
📄 Abstract - The Persona Paradox: Medical Personas as Behavioral Priors in Clinical Language Models

Persona conditioning can be viewed as a behavioral prior for large language models (LLMs) and is often assumed to confer expertise and improve safety in a monotonic manner. However, its effects on high-stakes clinical decision-making remain poorly characterized. We systematically evaluate persona-based control in clinical LLMs, examining how professional roles (e.g., Emergency Department physician, nurse) and interaction styles (bold vs. cautious) influence behavior across models and medical tasks. We assess performance on clinical triage and patient-safety tasks using multidimensional evaluations that capture task accuracy, calibration, and safety-relevant risk behavior. We find systematic, context-dependent, and non-monotonic effects: medical personas improve performance in critical care tasks, yielding gains of up to $\sim+20\%$ in accuracy and calibration, but degrade performance in primary-care settings by comparable margins. Interaction style modulates risk propensity and sensitivity, but its effect is highly model-dependent. While aggregated LLM-judge rankings favor medical over non-medical personas in safety-critical cases, we find that human clinicians show moderate agreement on safety compliance (average Cohen's $\kappa = 0.43$) but indicate low confidence in 95.9% of their responses on reasoning quality. Our work shows that personas function as behavioral priors that introduce context-dependent trade-offs rather than guarantees of safety or expertise. The code is available at this https URL\_Paradox.

Top-level tags: llm medical model evaluation
Detailed tags: persona conditioning clinical decision-making safety evaluation behavioral prior human-ai alignment

The Persona Paradox: Medical Personas as Behavioral Priors in Clinical Language Models


1️⃣ One-sentence summary

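This study finds that assigning a large language model a professional role such as physician or nurse does not guarantee that it will be safer or more expert in every medical scenario. Instead, the effects are complex and inconsistent: for example, performance improves on emergency care tasks but degrades in primary care.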

Source: arXiv: 2601.05376