X2HDR: HDR Image Generation in a Perceptually Uniform Space
1️⃣ One-sentence summary
This paper proposes an efficient method for adapting existing diffusion models without training from scratch: by finetuning in a perceptually uniform space, it enables both text-to-HDR generation and HDR reconstruction from a single RAW image, substantially improving the quality and realism of the generated images.
High-dynamic-range (HDR) formats and displays are becoming increasingly prevalent, yet state-of-the-art image generators (e.g., Stable Diffusion and FLUX) typically remain limited to low-dynamic-range (LDR) output due to the lack of large-scale HDR training data. In this work, we show that existing pretrained diffusion models can be easily adapted to HDR generation without retraining from scratch. A key challenge is that HDR images are natively represented in linear RGB, whose intensity and color statistics differ substantially from those of sRGB-encoded LDR images. This gap, however, can be effectively bridged by converting HDR inputs into perceptually uniform encodings (e.g., using PU21 or PQ). Empirically, we find that LDR-pretrained variational autoencoders (VAEs) reconstruct PU21-encoded HDR inputs with fidelity comparable to LDR data, whereas linear RGB inputs cause severe degradations. Motivated by this finding, we describe an efficient adaptation strategy that freezes the VAE and finetunes only the denoiser via low-rank adaptation in a perceptually uniform space. This results in a unified computational method that supports both text-to-HDR synthesis and single-image RAW-to-HDR reconstruction. Experiments demonstrate that our perceptually encoded adaptation consistently improves perceptual fidelity, text-image alignment, and effective dynamic range, relative to previous techniques.
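To make the encoding step concrete, here is a minimal sketch of the PQ (SMPTE ST 2084) transfer function, one of the two perceptually uniform encodings the abstract names. PU21 has a similar rational-power form with perceptually fitted parameters. This is an illustrative NumPy implementation rather than the paper's code, and the names `pq_encode`, `pq_decode`, and `peak_nits` are assumptions for this sketch.

```python
import numpy as np

# SMPTE ST 2084 (PQ) constants; absolute luminance is normalized by 10,000 cd/m^2.
M1 = 2610 / 16384          # ~0.1593
M2 = 2523 / 4096 * 128     # ~78.8438
C1 = 3424 / 4096           # ~0.8359
C2 = 2413 / 4096 * 32     # ~18.8516
C3 = 2392 / 4096 * 32     # ~18.6875

def pq_encode(linear: np.ndarray, peak_nits: float = 10000.0) -> np.ndarray:
    """Map linear luminance (cd/m^2) to a perceptually uniform PQ signal in [0, 1]."""
    y = np.clip(linear / peak_nits, 0.0, 1.0)
    ym = y ** M1
    return ((C1 + C2 * ym) / (1.0 + C3 * ym)) ** M2

def pq_decode(signal: np.ndarray, peak_nits: float = 10000.0) -> np.ndarray:
    """Invert pq_encode: recover linear luminance from a PQ-encoded signal."""
    e = np.clip(signal, 0.0, 1.0) ** (1.0 / M2)
    ym = np.maximum(e - C1, 0.0) / (C2 - C3 * e)
    return (ym ** (1.0 / M1)) * peak_nits
```

Because the encoded signal lives in [0, 1] with intensity statistics much closer to sRGB than linear RGB, an LDR-pretrained VAE can reconstruct it with little error, which is what lets the VAE stay frozen while only the denoiser is finetuned via LoRA.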
Source: arXiv:2602.04814