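📄 Paper Summary
Locality in Image Diffusion Models Emerges from Data Statistics
1️⃣ One-Sentence Summary
This paper shows, both theoretically and experimentally, that the local dependence image diffusion models exhibit when processing pixels stems mainly from the statistical correlations in the image data itself, rather than from the built-in inductive bias of convolutional neural networks.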
Recent work has shown that the generalization ability of image diffusion models arises from the locality properties of the trained neural network. In particular, when denoising a particular pixel, the model relies on a limited neighborhood of the input image around that pixel, which, according to the previous work, is tightly related to the ability of these models to produce novel images. Since locality is central to generalization, it is crucial to understand why diffusion models learn local behavior in the first place, as well as the factors that govern the properties of locality patterns. In this work, we present evidence that the locality in deep diffusion models emerges as a statistical property of the image dataset and is not due to the inductive bias of convolutional neural networks, as suggested in previous work. Specifically, we demonstrate that an optimal parametric linear denoiser exhibits similar locality properties to deep neural denoisers. We show, both theoretically and experimentally, that this locality arises directly from pixel correlations present in the image datasets. Moreover, locality patterns are drastically different on specialized datasets, approximating principal components of the data's covariance. We use these insights to craft an analytical denoiser that better matches scores predicted by a deep diffusion model than prior expert-crafted alternatives. Our key takeaway is that while neural network architectures influence generation quality, their primary role is to capture locality patterns inherent in the data.
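The claim that an optimal linear denoiser inherits locality from pixel correlations can be illustrated with a small sketch. It is not the paper's code: it assumes the additive noise model x_noisy = x + sigma * eps with eps ~ N(0, I), for which the MMSE linear denoiser is the classical Wiener filter W = C (C + sigma^2 I)^{-1} applied to the mean-centered input, where C is the data covariance. The 1D "image", the exponentially decaying covariance, and the names `toy_covariance`, `linear_denoiser_weights`, and `locality_fraction` are all hypothetical choices made for illustration.

```python
import numpy as np


def toy_covariance(n: int, length_scale: float) -> np.ndarray:
    """Hypothetical pixel covariance: correlation decays with pixel distance."""
    idx = np.arange(n)
    return np.exp(-np.abs(idx[:, None] - idx[None, :]) / length_scale)


def linear_denoiser_weights(cov: np.ndarray, sigma: float) -> np.ndarray:
    """Wiener filter W = cov @ inv(cov + sigma^2 I).

    The linear MMSE estimate is mean + W @ (x_noisy - mean), assuming
    x_noisy = x + sigma * eps with eps ~ N(0, I).
    """
    n = cov.shape[0]
    # solve() returns inv(cov + sigma^2 I) @ cov; both factors are symmetric,
    # so transposing gives cov @ inv(cov + sigma^2 I).
    return np.linalg.solve(cov + sigma**2 * np.eye(n), cov).T


def locality_fraction(weights: np.ndarray, pixel: int, radius: int) -> float:
    """Share of absolute weight that row `pixel` places within `radius` pixels."""
    row = np.abs(weights[pixel])
    lo, hi = max(pixel - radius, 0), min(pixel + radius + 1, row.size)
    return float(row[lo:hi].sum() / row.sum())


n = 65
center = n // 2
cov = toy_covariance(n, length_scale=4.0)

W_low = linear_denoiser_weights(cov, sigma=0.5)
W_high = linear_denoiser_weights(cov, sigma=2.0)

frac_low = locality_fraction(W_low, center, radius=5)
frac_high = locality_fraction(W_high, center, radius=5)
print(f"weight within 5 pixels, sigma=0.5: {frac_low:.3f}")
print(f"weight within 5 pixels, sigma=2.0: {frac_high:.3f}")
```

No convolution or other architectural prior appears anywhere in this sketch; the row of W that denoises the central pixel concentrates on nearby pixels only because the assumed covariance does. In this toy setup the neighborhood also widens as sigma grows, since the filter has to average over more pixels to suppress stronger noise.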