Discourse Diversity in Multi-Turn Empathic Dialogue
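1️⃣ One-Sentence Summary
This paper finds that large language models repeatedly reuse the same discourse strategies in multi-turn empathic dialogue, which makes their responses come across as formulaic. It proposes MINT, a reinforcement learning training framework that increases discourse diversity and thereby significantly improves the quality of empathic dialogue.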
Large language models (LLMs) produce responses rated as highly empathic in single-turn settings (Ayers et al., 2023; Lee et al., 2024), yet they are also known to be formulaic generators that reuse the same lexical patterns, syntactic templates, and discourse structures across tasks (Jiang et al., 2025; Shaib et al., 2024; Namuduri et al., 2025). Less attention has been paid to whether this formulaicity extends to the level of discourse moves, i.e., what a response does for the person it is addressing. This question is especially consequential for empathic dialogue, where effective support demands not just a kind response at one moment but varied strategies as a conversation unfolds (Stiles et al., 1998). Indeed, prior work shows that LLMs reuse the same tactic sequences more than human supporters in single-turn settings (Gueorguieva et al., 2026). We extend this analysis to multi-turn conversations and find that the rigidity compounds: once a tactic appears in a supporter turn, LLMs reuse it in the next at nearly double the rate of humans (0.50-0.56 vs. 0.27). This pattern holds across LLMs serving as supporters in real emotional support conversations, and is invisible to standard similarity metrics. To address this gap, we introduce MINT (Multi-turn Inter-tactic Novelty Training), the first reinforcement learning framework to optimize discourse move diversity across multi-turn empathic dialogue. The best MINT variant combines an empathy quality reward with a cross-turn tactic novelty signal, improving aggregate empathy by 25.3% over vanilla across 1.7B and 4B models while reducing cross-turn discourse move repetition by 26.3% on the 4B model, surpassing all baselines including quality-only and token-level diversity methods on both measures. These results suggest that what current models lack is not empathy itself, but the ability to vary their discourse moves across a conversation.
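To make the cross-turn reuse figure (0.50-0.56 vs. 0.27) more concrete, the following is a minimal sketch of how such a rate could be computed. It assumes that each supporter turn has already been annotated with a set of tactic labels and that "reuse" means a tactic in one supporter turn also appears in the next supporter turn. The function names, the tactic labels, and the `tactic_novelty` helper are hypothetical illustrations and are not taken from the paper, whose exact metric and reward definitions may differ.

```python
from typing import List, Set

# Assumption: a conversation is a list of supporter turns, each turn
# represented as the set of tactic labels it contains.
Conversation = List[Set[str]]


def cross_turn_reuse_rate(conversations: List[Conversation]) -> float:
    """Fraction of tactics in a supporter turn that reappear in the next one.

    Hypothetical definition: pooled over all adjacent supporter-turn pairs.
    """
    reused = 0
    total = 0
    for turns in conversations:
        for prev, nxt in zip(turns, turns[1:]):
            total += len(prev)          # tactics that could be reused
            reused += len(prev & nxt)   # tactics that actually were reused
    return reused / total if total else 0.0


def tactic_novelty(prev: Set[str], current: Set[str]) -> float:
    """Share of the current turn's tactics that did not appear in the previous turn.

    Hypothetical stand-in for a cross-turn novelty signal.
    """
    if not current:
        return 0.0
    return len(current - prev) / len(current)


# Toy conversation with made-up tactic labels.
example = [
    [{"validation", "question"}, {"validation", "advice"}, {"advice"}],
]
print(cross_turn_reuse_rate(example))                 # 0.5
print(tactic_novelty(example[0][0], example[0][1]))   # 0.5
```

In the toy conversation, one of two tactics carries over in each adjacent pair of turns, so the pooled reuse rate is 0.5. A token-level similarity metric would not necessarily register this, since two turns can share a tactic while using different wording.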
Source: arXiv: 2604.11742