📄 Abstract - Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Diffusion model distillation has emerged as a powerful technique for creating efficient few-step and single-step generators. Among these, Distribution Matching Distillation (DMD) and its variants stand out for their impressive performance, which is widely attributed to their core mechanism of matching the student's output distribution to that of a pre-trained teacher model. In this work, we challenge this conventional understanding. Through a rigorous decomposition of the DMD training objective, we reveal that in complex tasks like text-to-image generation, where classifier-free guidance (CFG) is typically required for desirable few-step performance, the primary driver of few-step distillation is not distribution matching, but a previously overlooked component we identify as CFG Augmentation (CA). We demonstrate that this term acts as the core "engine" of distillation, while the Distribution Matching (DM) term functions as a "regularizer" that ensures training stability and mitigates artifacts. We further validate this decoupling by demonstrating that while the DM term is a highly effective regularizer, it is not unique; simpler non-parametric constraints or GAN-based objectives can serve the same stabilizing function, albeit with different trade-offs. This division of labor motivates a more principled analysis of the properties of both terms, leading to a more systematic and in-depth understanding. This new understanding further enables us to propose principled modifications to the distillation process, such as decoupling the noise schedules for the engine and the regularizer, leading to further performance gains. Notably, our method has been adopted by the Z-Image (this https URL) project to develop a top-tier 8-step image generation model, empirically validating the generalization and robustness of our findings.
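To make the decomposition concrete, below is a minimal, hypothetical sketch (not the authors' code) of how a DMD-style generator gradient splits into a distribution-matching residual and a CFG Augmentation term once the teacher score is CFG-guided. The helper names (`teacher_cond`, `teacher_uncond`, `fake_score`) and the guidance weight `w` are placeholder assumptions; the exact weighting, sign conventions, and noise schedules in the paper may differ.

```python
# Sketch of the DM / CA split implied by applying CFG to the teacher score in DMD.
# Standard DMD uses  grad = s_fake(x_t) - s_real_cfg(x_t),  with
#     s_real_cfg = s_uncond + w * (s_cond - s_uncond).
# Rearranging gives:
#     DM term:  s_fake - s_cond                    (distribution matching, the "shield")
#     CA term: -(w - 1) * (s_cond - s_uncond)      (CFG Augmentation, the "spear")
# All model stand-ins below are toy placeholders so the sketch runs end-to-end.

import torch

def dmd_generator_grad(x_t, t, cond, teacher_cond, teacher_uncond, fake_score, w=7.5):
    """Return (total gradient, DM term, CA term) at the noisy sample x_t."""
    s_cond = teacher_cond(x_t, t, cond)        # teacher score with the text condition
    s_uncond = teacher_uncond(x_t, t)          # teacher score without the condition
    s_fake = fake_score(x_t, t, cond)          # score of the student's own output distribution

    dm_term = s_fake - s_cond                      # regularizer: pull student toward the teacher
    ca_term = -(w - 1.0) * (s_cond - s_uncond)     # engine: push along the guidance direction
    return dm_term + ca_term, dm_term, ca_term


if __name__ == "__main__":
    # Toy linear "scores"; real use would plug in the teacher UNet (with and
    # without the prompt) and the online "fake" score network.
    teacher_cond = lambda x, t, c: -x + 0.1 * c
    teacher_uncond = lambda x, t: -x
    fake_score = lambda x, t, c: -0.9 * x

    x_t, cond = torch.randn(4, 16), torch.randn(4, 16)
    grad, dm, ca = dmd_generator_grad(x_t, t=0.5, cond=cond,
                                      teacher_cond=teacher_cond,
                                      teacher_uncond=teacher_uncond,
                                      fake_score=fake_score)
    print(grad.shape, dm.norm().item(), ca.norm().item())
```

Under this split, the DM residual vanishes once the student matches the conditional teacher, while the CA term keeps pushing samples along the guidance direction, which is consistent with the paper's "spear and shield" framing.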

Top-level tags: model training, aigc, computer vision
Detailed tags: diffusion distillation, distribution matching, cfg augmentation, text-to-image, model efficiency

Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield


1️⃣ One-Sentence Summary

This paper revisits the prevailing view of diffusion model distillation and finds that its core driving force is not the traditional distribution matching but a previously overlooked "CFG Augmentation" mechanism, which is the key engine behind efficient few-step generation, while distribution matching mainly plays an auxiliary role of stabilizing training; this new understanding in turn enables the development of better distillation methods.


📄 Open Original PDF