Abstract - SpectralDiT: Timestep-Conditioned Spectral Residual Correction for Flow-Matching DiTs
We propose SpectralDiT, a lightweight modification to flow-matching Diffusion Transformers that adds timestep-conditioned spectral correction to the MLP residual branch. The module decomposes each residual update into low- and high-frequency components on the patch-token grid, then learns a zero-initialized additive gate so the model initially matches the baseline DiT. On CIFAR-10 pixel-space generation, SpectralDiT improves FID from 20.78 to 19.71 at patch size 1 and reduces the radial Fourier spectrum gap. Furthermore, we scale our method to latent diffusion on ImageNet-100. With 0.6% additional theoretical FLOPs and 1.36% additional parameters, SpectralDiT improves latent flow-matching, achieving an 8.7% relative FID reduction under classifier-free guidance (CFG 2.0). All reported results are averaged over five seeds. Ablations and gate visualizations on CIFAR-10 reveal stable block-specific spectral correction patterns.
SpectralDiT: Timestep-Conditioned Spectral Residual Correction for Flow-Matching DiTs
1️⃣ One-Sentence Summary
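The paper proposes SpectralDiT, a lightweight plug-in that adds a timestep-conditioned spectral correction module to the residual branch of a Diffusion Transformer, improving image generation quality at a very small cost in compute and parameters (0.6% additional theoretical FLOPs and 1.36% additional parameters in the ImageNet-100 latent setting), with results validated on CIFAR-10 and ImageNet-100.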
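The mechanism described in the abstract (split each MLP residual update into low- and high-frequency parts on the patch-token grid, then add a timestep-conditioned gate that starts at zero) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the FFT-based split, the radial `cutoff`, the per-channel linear gate, and every name here (`split_frequencies`, `spectral_correction`, `w_gate`, `b_gate`) are assumptions made for illustration only.

```python
import numpy as np

def split_frequencies(delta, grid_hw, cutoff=0.25):
    """Split (B, N, D) token updates into low/high bands on the H x W grid.

    The radial cutoff (in cycles per token) is a hypothetical choice.
    """
    b, n, d = delta.shape
    h, w = grid_hw
    assert n == h * w, "token count must match the patch grid"
    x = delta.reshape(b, h, w, d)
    # Radial low-pass mask over the 2D frequency grid.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    mask = (np.sqrt(fy**2 + fx**2) <= cutoff).astype(x.dtype)
    spec = np.fft.fft2(x, axes=(1, 2))
    low = np.fft.ifft2(spec * mask[None, :, :, None], axes=(1, 2)).real
    high = x - low  # the two bands sum back to the input by construction
    return low.reshape(b, n, d), high.reshape(b, n, d)

def spectral_correction(delta, t_emb, w_gate, b_gate, grid_hw, cutoff=0.25):
    """Add gated low/high-frequency corrections to a residual update.

    Gates are a linear function of the timestep embedding (an assumption).
    """
    low, high = split_frequencies(delta, grid_hw, cutoff)
    gates = t_emb @ w_gate + b_gate            # (B, 2D)
    g_low, g_high = np.split(gates, 2, axis=-1)
    return delta + g_low[:, None, :] * low + g_high[:, None, :] * high

# Zero-initialized gate parameters: the corrected update equals the baseline.
rng = np.random.default_rng(0)
B, H, W, D, E = 2, 8, 8, 16, 32
delta = rng.standard_normal((B, H * W, D))   # MLP residual update
t_emb = rng.standard_normal((B, E))          # timestep embedding
w_gate = np.zeros((E, 2 * D))
b_gate = np.zeros(2 * D)
out = spectral_correction(delta, t_emb, w_gate, b_gate, (H, W))
assert np.allclose(out, delta)
```

With the gate weights at zero the additive terms vanish, which is the property the abstract relies on for the model to match the baseline DiT at initialization. As training moves the gates away from zero, each block can rescale its low and high bands differently per timestep.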