IDLM: Inverse-distilled Diffusion Language Models
1️⃣ One-sentence summary
This paper proposes a new method, IDLM, which extends a technique called "inverse distillation" to text generation models, speeding up diffusion language model inference by 4x to 64x while preserving the quality of the generated text.
Diffusion Language Models (DLMs) have recently achieved strong results in text generation. However, their multi-step sampling leads to slow inference, limiting practical use. To address this, we extend Inverse Distillation, a technique originally developed to accelerate continuous diffusion models, to the discrete setting. This extension introduces both theoretical and practical challenges. Theoretically, the inverse distillation objective lacks uniqueness guarantees, which may lead to suboptimal solutions. Practically, backpropagation through the discrete space is non-trivial and often unstable. To overcome these challenges, we first prove that our inverse formulation admits a unique solution, ensuring a well-posed optimization problem. We then introduce gradient-stable relaxations to support effective training. Experiments on multiple DLMs show that our method, Inverse-distilled Diffusion Language Models (IDLM), reduces the number of inference steps by 4x-64x while preserving the teacher model's entropy and generative perplexity.
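The summary does not specify which "gradient-stable relaxation" the paper uses. A common way to make discrete token sampling differentiable is the Gumbel-softmax relaxation, sketched below in NumPy purely as an illustration of the general idea (all names are illustrative and not taken from the paper):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw a 'soft' one-hot sample from categorical logits.

    As tau -> 0 the output approaches a hard one-hot draw; larger tau
    yields a smoother vector that gradients can flow through.
    (Illustrative sketch; not the relaxation from the paper.)
    """
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise: argmax(logits + g) is an exact categorical sample.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    y = y - y.max()            # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()

# Example: a tiny 3-token vocabulary.
logits = np.log(np.array([0.7, 0.2, 0.1]))
sample = gumbel_softmax(logits, tau=0.5)
# 'sample' is a non-negative vector summing to 1, concentrated near one token.
```

With low temperature the soft sample is nearly one-hot, so a student model can be trained end-to-end through the sampling step while still behaving like a discrete token generator at inference time.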
Source: arXiv:2602.19066