Everything at Every Scale: Scale-Invariant Diffusion with Continuous Super-Resolution
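1️⃣ One-sentence summary
This paper proposes SKILD, a single diffusion model that exploits the scale invariance shared by natural images and physical systems to unify image generation and super-resolution at arbitrary magnification factors. It requires no task-specific architecture or conditional training and achieves leading results on several benchmarks.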
Creating images from noise is image generation; reconstructing fine details from coarse inputs is super-resolution. Despite their practical differences, both can be understood as reversing information loss across scales. We introduce $\textbf{SKILD}$, a $\textbf{S}$cale-invariant $\textbf{K}$-Space $\textbf{I}$mage $\textbf{L}$earning $\textbf{D}$iffusion model that unifies generation and continuous super-resolution within a single unconditional framework. Both natural images and critical physical systems exhibit scale invariance, and we leverage it to design a forward process that attenuates image content from fine to coarse scales while injecting spectrum-matched Gaussian noise, making scale an explicit coordinate of the diffusion dynamics. The same trained reverse process performs generation and continuous super-resolution by varying only the starting timestep: $\textit{no task-specific architecture, no conditioning branch, no classifier-free guidance, no retraining per scale factor}$. Empirically, SKILD reaches FID $2.65$ and Inception Score $9.63$ on unconditional CIFAR-10, performs $2\times$--$8\times$ super-resolution on ImageNet from a single unconditional checkpoint while outperforming conditional models across perceptual metrics, and reconstructs critical Ising models whose connected four-point correlations closely track the ground truth.
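The forward process described in the abstract (fine scales attenuated first, replaced by spectrum-matched Gaussian noise) can be illustrated with a toy NumPy sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the names `radial_frequency` and `forward_step`, the Gaussian low-pass filter, the geometric cutoff schedule, the power-law exponent `alpha`, and the mixing rule are hypothetical and are not taken from the paper.

```python
import numpy as np


def radial_frequency(n):
    """Return |k| on an n x n FFT grid, in cycles per pixel."""
    f = np.fft.fftfreq(n)
    kx, ky = np.meshgrid(f, f, indexing="ij")
    return np.sqrt(kx**2 + ky**2)


def forward_step(image, t, alpha=2.0, rng=None):
    """Toy k-space corruption of a square grayscale image at time t in [0, 1].

    Assumptions (not from the paper): a Gaussian low-pass whose cutoff
    shrinks geometrically with t, and noise with power spectrum ~ |k|^-alpha.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = image.shape[0]
    k = radial_frequency(n)

    # Assumed schedule: cutoff moves from the highest frequency (t = 0)
    # down to the lowest nonzero frequency (t = 1).
    k_min, k_max = 1.0 / n, k.max()
    k_cut = k_max * (k_min / k_max) ** t

    # Fraction of each Fourier mode that survives; coarse scales survive longest.
    keep = np.exp(-0.5 * (k / k_cut) ** 2)

    # Gaussian noise shaped to a power-law spectrum (amplitude ~ |k|^(-alpha/2)).
    k_safe = np.where(k == 0, k_min, k)
    noise_hat = np.fft.fft2(rng.standard_normal(image.shape)) * k_safe ** (-alpha / 2)

    # Attenuated signal plus noise filling in the removed scales.
    mixed_hat = keep * np.fft.fft2(image) + np.sqrt(1.0 - keep**2) * noise_hat
    return np.fft.ifft2(mixed_hat).real


rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))
noisy = forward_step(img, t=0.5, rng=rng)
```

In this toy picture, a larger `t` leaves only coarser scales of the original image intact, which matches the abstract's statement that generation and super-resolution differ only in the starting timestep: a low-resolution input would correspond to some intermediate `t`, and pure noise to the largest one. How SKILD actually maps a scale factor to a timestep is not specified in the abstract.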
Source: arXiv: 2605.26032