arXiv submission date: 2026-03-02
📄 Abstract - Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration

Diffusion models have become the dominant tool for high-fidelity image and video generation, yet are critically bottlenecked by their inference speed due to the numerous iterative passes of Diffusion Transformers. To reduce the exhaustive compute, recent works resort to the feature caching and reusing scheme that skips network evaluations at selected diffusion steps by using cached features in previous steps. However, their preliminary design solely relies on local approximation, causing errors to grow rapidly with large skips and leading to degraded sample quality at high speedups. In this work, we propose spectral diffusion feature forecaster (Spectrum), a training-free approach that enables global, long-range feature reuse with tightly controlled error. In particular, we view the latent features of the denoiser as functions over time and approximate them with Chebyshev polynomials. Specifically, we fit the coefficient for each basis via ridge regression, which is then leveraged to forecast features at multiple future diffusion steps. We theoretically reveal that our approach admits more favorable long-horizon behavior and yields an error bound that does not compound with the step size. Extensive experiments on various state-of-the-art image and video diffusion models consistently verify the superiority of our approach. Notably, we achieve up to 4.79$\times$ speedup on FLUX.1 and 4.67$\times$ speedup on Wan2.1-14B, while maintaining much higher sample quality compared with the baselines.
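The fitting and forecasting steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `fit_chebyshev_ridge` and `forecast_features`, the polynomial degree, the ridge strength `lam`, the time range, and the flattening of features to one vector per step are all assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def _to_unit_interval(times, t_range):
    """Map diffusion times from t_range onto [-1, 1], the Chebyshev domain."""
    lo, hi = t_range
    return 2.0 * (np.asarray(times, dtype=float) - lo) / (hi - lo) - 1.0


def fit_chebyshev_ridge(times, feats, degree=3, lam=1e-3, t_range=(0.0, 1.0)):
    """Fit Chebyshev coefficients to cached features with ridge regression.

    times: (n_obs,) steps at which the network was actually evaluated.
    feats: (n_obs, feat_dim) flattened features cached at those steps.
    Returns coefficients of shape (degree + 1, feat_dim).
    """
    x = _to_unit_interval(times, t_range)
    V = C.chebvander(x, degree)                  # (n_obs, degree + 1) basis matrix
    A = V.T @ V + lam * np.eye(degree + 1)       # regularized normal equations
    return np.linalg.solve(A, V.T @ np.asarray(feats, dtype=float))


def forecast_features(coef, times, t_range=(0.0, 1.0)):
    """Evaluate the fitted expansion at new steps, replacing network calls."""
    x = _to_unit_interval(times, t_range)
    V = C.chebvander(x, coef.shape[0] - 1)
    return V @ coef


# Toy demo: smooth synthetic "features" stand in for real denoiser activations.
t_cached = np.linspace(0.0, 0.5, 8)              # steps with real evaluations
t_skipped = np.array([0.55, 0.60, 0.65])         # steps to forecast instead


def toy_features(t):
    return np.stack([np.sin(2.0 * t), np.cos(3.0 * t)], axis=1)


coef = fit_chebyshev_ridge(t_cached, toy_features(t_cached), degree=3, lam=1e-6)
pred = forecast_features(coef, t_skipped)
print("max abs forecast error:", np.abs(pred - toy_features(t_skipped)).max())
```

A single fit serves several future steps at once, which is the property the abstract contrasts with local, step-by-step approximation. In a real sampler the cached rows would be intermediate features of the denoiser, and how the paper selects which steps to skip is not covered by this sketch.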

Top-level tags: model training, computer vision, aigc
Detailed tags: diffusion models, sampling acceleration, spectral forecasting, feature reuse, chebyshev polynomials

Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration


1️⃣ One-sentence summary

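This paper proposes Spectrum, a training-free method that uses Chebyshev polynomials to forecast and reuse features from the diffusion model's denoising process, significantly speeding up image and video generation while keeping error under control and preserving high output quality.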

Source: arXiv: 2603.01623