arXiv submission date: 2026-01-20
📄 Abstract - Diffusion In Diffusion: Reclaiming Global Coherence in Semi-Autoregressive Diffusion

One of the most compelling features of discrete diffusion language models is their global bidirectional contextual capability. However, existing block-based diffusion studies tend to introduce autoregressive priors which, while offering benefits, can cause models to lose this global coherence at the macro level. To regain global contextual understanding while preserving the advantages of the semi-autoregressive paradigm, we propose Diffusion in Diffusion, a 'draft-then-refine' framework designed to overcome the irreversibility and myopia problems inherent in block diffusion models. Our approach first employs block diffusion to generate rapid drafts using small blocks, then refines these drafts through global bidirectional diffusion with a larger bidirectional receptive field. We use snapshot confidence remasking to identify the most critical tokens requiring modification, and apply mix-scale training to expand the block diffusion model's global capabilities. Empirical results demonstrate that our approach sets a new state of the art for discrete diffusion models on the OpenWebText dataset. Using only 26% of the fine-tuning budget of baseline models, we reduce generative perplexity from 25.7 to 21.9, significantly narrowing the performance gap with autoregressive models.
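
The abstract only describes the pipeline at a high level, so the sketch below is a hedged illustration rather than the authors' implementation: `block_model`, `global_model`, `MASK_ID`, the block size, the remask ratio, and greedy argmax decoding are all assumptions, and the paper's exact snapshot confidence remasking schedule and mix-scale training are not specified here. It shows the two stages named in the abstract: a small-block semi-autoregressive draft, then global bidirectional refinement that remasks and re-denoises the least-confident draft tokens.

```python
import torch

MASK_ID = 0  # hypothetical id of the [MASK] token

def draft_then_refine(block_model, global_model, seq_len, block_size=16,
                      refine_steps=8, remask_ratio=0.1):
    """Hedged sketch of a 'draft-then-refine' pipeline.

    Stage 1: a block diffusion model fills the sequence left to right in
    small blocks (the semi-autoregressive draft). Stage 2: snapshot
    confidences select the least-confident tokens, which are remasked and
    re-denoised by a global bidirectional diffusion model. Both models are
    assumed to map a (1, seq_len) token tensor to (1, seq_len, vocab) logits.
    """
    tokens = torch.full((1, seq_len), MASK_ID, dtype=torch.long)

    # ---- Stage 1: semi-autoregressive draft with small blocks ----
    for start in range(0, seq_len, block_size):
        end = min(start + block_size, seq_len)
        logits = block_model(tokens)          # conditions on finished blocks
        probs = logits.softmax(-1)
        tokens[0, start:end] = probs[0, start:end].argmax(-1)

    # ---- Stage 2: global bidirectional refinement ----
    for _ in range(refine_steps):
        probs = global_model(tokens).softmax(-1)  # full receptive field
        # Snapshot confidence: probability the model assigns to each
        # token currently in the draft.
        conf = probs[0].gather(-1, tokens[0, :, None]).squeeze(-1)
        k = max(1, int(remask_ratio * seq_len))
        worst = conf.topk(k, largest=False).indices  # least-confident spots
        tokens[0, worst] = MASK_ID                   # remask only those
        logits = global_model(tokens)                # re-denoise them
        tokens[0, worst] = logits.softmax(-1)[0, worst].argmax(-1)

    return tokens
```

In practice the remask ratio and number of refinement steps would be tuned, and stochastic sampling would likely replace the greedy argmax used here for brevity.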

Top-level tags: natural language processing, model training, machine learning
Detailed tags: discrete diffusion, language modeling, semi-autoregressive, global coherence, text generation

Diffusion In Diffusion: Reclaiming Global Coherence in Semi-Autoregressive Diffusion


1️⃣ One-sentence summary

This paper proposes a two-stage text generation framework called 'Diffusion in Diffusion': it first generates a quick draft via small-block diffusion, then refines it with global bidirectional diffusion, significantly improving the global coherence of the text while preserving generation efficiency and substantially narrowing the performance gap with mainstream autoregressive models.

Source: arXiv:2601.13599