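Who Should Lead Decoding Now? Tracking Reliable Trajectories for Ensembling Masked Diffusion Language Models
1️⃣ One-sentence summary
This paper proposes TIE, an ensembling method that dynamically tracks how the confidence of masked diffusion language models changes during decoding and hands generation over, relay-style, to whichever model is currently on the most reliable trajectory, thereby fusing the knowledge of multiple models and significantly improving performance on complex reasoning tasks.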
Masked Diffusion Language Models (MDLMs) have emerged as a distinct paradigm for sequence generation. As MDLMs become diverse in capabilities and knowledge coverage, an important question is how to combine their knowledge. Toward this, we first investigate the unique decoding dynamics of MDLMs. We find that successful generations exhibit stable confidence dynamics over answer-relevant positions, while unreliable trajectories can often be corrected by injecting promising intermediate states from other models. Guided by this observation, we propose $\textbf{TIE}$ ($\textbf{T}$rajectory-based $\textbf{I}$terative $\textbf{E}$nsembling), a knowledge fusion framework in which MDLMs iteratively identify reliable decoding trajectories and relay them across models. TIE tracks confidence dynamics over answer-relevant positions to determine which model currently follows a more reliable trajectory and selectively transfers partially denoised sequences across models. As the model on the more promising trajectory often changes across denoising steps, TIE allows different models to contribute complementary strengths at different stages of generation. Strong performance across diverse reasoning tasks, along with our analyses, suggests that TIE offers a practical approach to the underexplored problem of MDLM ensembling.
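The relay idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the reliability score (latest mean confidence over answer-relevant positions minus its most recent swing), the leader-selection rule, the toy models, and every name below (`reliability`, `relay_decode`, `make_toy_model`) are assumptions made purely for illustration, since the abstract does not specify these details.

```python
from typing import Callable, List, Optional, Tuple

MASK = None  # placeholder for a position that is still masked
Prediction = List[Tuple[str, float]]  # (token, confidence) for every position
Model = Callable[[List[Optional[str]]], Prediction]


def reliability(conf_history: List[float]) -> float:
    """Hypothetical score: latest confidence, penalized by its last swing."""
    if len(conf_history) < 2:
        return conf_history[-1]
    return conf_history[-1] - abs(conf_history[-1] - conf_history[-2])


def relay_decode(
    models: List[Model],
    length: int,
    answer_positions: List[int],
    tokens_per_step: int = 1,
) -> Tuple[List[Optional[str]], List[int]]:
    """Unmask a shared sequence step by step, letting the model with the
    highest (assumed) reliability score lead each denoising step."""
    state: List[Optional[str]] = [MASK] * length
    histories: List[List[float]] = [[] for _ in models]
    leaders: List[int] = []
    while MASK in state:
        preds = [model(state) for model in models]
        # Track each model's mean confidence over answer-relevant positions.
        for hist, pred in zip(histories, preds):
            mean_conf = sum(pred[i][1] for i in answer_positions) / len(answer_positions)
            hist.append(mean_conf)
        leader = max(range(len(models)), key=lambda k: reliability(histories[k]))
        leaders.append(leader)
        # The leader commits its most confident masked positions; the
        # partially denoised sequence is then shared with all models.
        masked = [i for i, tok in enumerate(state) if tok is MASK]
        masked.sort(key=lambda i: preds[leader][i][1], reverse=True)
        for i in masked[:tokens_per_step]:
            state[i] = preds[leader][i][0]
    return state, leaders


def make_toy_model(token: str, conf_fn: Callable[[int], float]) -> Model:
    """Toy stand-in for an MDLM: always predicts `token`, with a confidence
    that depends only on how many positions are still masked."""
    def model(state: List[Optional[str]]) -> Prediction:
        n_masked = sum(tok is MASK for tok in state)
        return [(token, conf_fn(n_masked)) for _ in state]
    return model


# Model A is confident early in decoding, model B late (both invented).
model_a = make_toy_model("a", lambda n: 0.9 if n > 2 else 0.3)
model_b = make_toy_model("b", lambda n: 0.4 if n > 2 else 0.8)

state, leaders = relay_decode([model_a, model_b], length=4, answer_positions=[2, 3])
print(state)    # ['a', 'a', 'b', 'b']
print(leaders)  # [0, 0, 1, 1]
```

In this toy run the lead passes from the first model to the second halfway through decoding, mirroring the abstract's observation that the model on the more promising trajectory often changes across denoising steps.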
Source: arXiv: 2606.16281