Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Models
1️⃣ One-Sentence Summary
This paper introduces Stable-DiffCoder, a new code generation model trained with a novel block diffusion approach. Under matched compute and data, it outperforms its traditional autoregressive counterpart overall, and shows additional gains on code editing, reasoning, and low-resource programming language tasks.
Diffusion-based language models (DLLMs) offer non-sequential, block-wise generation and richer data reuse compared to autoregressive (AR) models, but existing code DLLMs still lag behind strong AR baselines under comparable budgets. We revisit this setting in a controlled study and introduce Stable-DiffCoder, a block diffusion code model that reuses the Seed-Coder architecture, data, and training pipeline. To enable efficient knowledge learning and stable training, we incorporate a block diffusion continual pretraining (CPT) stage enhanced by a tailored warmup and a block-wise clipped noise schedule. Under the same data and architecture, Stable-DiffCoder overall outperforms its AR counterpart on a broad suite of code benchmarks. Moreover, relying only on the CPT and supervised fine-tuning stages, Stable-DiffCoder achieves stronger performance than a wide range of ~8B AR models and DLLMs, demonstrating that diffusion-based training can improve code modeling quality beyond AR training alone. Finally, diffusion-based any-order modeling improves structured code modeling for editing and reasoning, and, through data augmentation, benefits low-resource coding languages.
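The abstract does not spell out the block diffusion CPT objective, but the general idea of block-wise masked-diffusion training with a clipped noise schedule can be sketched roughly as below. This is a minimal illustration under our own assumptions, not the paper's implementation; `BLOCK_SIZE`, `T_MIN`, `T_MAX`, `MASK_ID`, and the helper names are hypothetical.

```python
import torch
import torch.nn.functional as F

MASK_ID = 0               # hypothetical id of the [MASK] token
BLOCK_SIZE = 16           # hypothetical diffusion block length
T_MIN, T_MAX = 0.1, 0.9   # hypothetical clipping range for per-block noise levels


def block_clipped_masking(tokens: torch.Tensor):
    """Corrupt a batch of token ids with block-wise masked-diffusion noise.

    tokens: (batch, seq_len) long tensor; seq_len assumed divisible by BLOCK_SIZE.
    Returns the corrupted sequence and a boolean mask of noised positions.
    """
    batch, seq_len = tokens.shape
    n_blocks = seq_len // BLOCK_SIZE

    # One noise level per block, clipped so every block keeps some clean
    # context and some masked targets (a "block-wise clipped" schedule).
    t = torch.rand(batch, n_blocks).clamp_(T_MIN, T_MAX)   # (batch, n_blocks)
    t = t.repeat_interleave(BLOCK_SIZE, dim=1)             # (batch, seq_len)

    noise_mask = torch.rand(batch, seq_len) < t            # Bernoulli masking per token
    corrupted = torch.where(noise_mask, torch.full_like(tokens, MASK_ID), tokens)
    return corrupted, noise_mask


def diffusion_cpt_loss(model, tokens: torch.Tensor) -> torch.Tensor:
    """One continual-pretraining step: recover original tokens at masked positions."""
    corrupted, noise_mask = block_clipped_masking(tokens)
    logits = model(corrupted)                              # (batch, seq_len, vocab)
    return F.cross_entropy(logits[noise_mask], tokens[noise_mask])
```

In this reading, clipping the per-block noise level would keep each block from being fully masked or fully clean, which plausibly relates to the "stable training" the abstract attributes to the tailored schedule; the paper itself should be consulted for the exact formulation.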
Source: arXiv:2601.15892