LCUDiff: Latent Capacity Upgrade Diffusion for Faithful Human Body Restoration
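1️⃣ One-sentence summary
This paper proposes LCUDiff, a new method that upgrades the latent space of a pre-trained diffusion model from 4 channels to 16 channels and pairs this with dedicated training strategies, markedly improving the fidelity and detail quality of human body image restoration while retaining efficient one-step generation.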
Existing methods for restoring degraded human-centric images often struggle with insufficient fidelity, particularly in human body restoration (HBR). Recent diffusion-based restoration methods commonly adapt pre-trained text-to-image diffusion models, where the variational autoencoder (VAE) can significantly bottleneck restoration fidelity. We propose LCUDiff, a stable one-step framework that upgrades a pre-trained latent diffusion model from the 4-channel latent space to the 16-channel latent space. For VAE fine-tuning, channel splitting distillation (CSD) is used to keep the first four channels aligned with pre-trained priors while allocating the additional channels to effectively encode high-frequency details. We further design prior-preserving adaptation (PPA) to smoothly bridge the mismatch between 4-channel diffusion backbones and the higher-dimensional 16-channel latent. In addition, we propose a decoder router (DeR) for per-sample decoder routing using restoration-quality score annotations, which improves visual quality across diverse conditions. Experiments on synthetic and real-world datasets show competitive results with higher fidelity and fewer artifacts under mild degradations, while preserving one-step efficiency. The code and model will be made available online.
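The abstract does not give the exact form of the channel splitting distillation (CSD) objective. As a rough illustration of the idea only, the sketch below assumes a plain mean-squared-error term that ties the first four channels of a 16-channel latent to a frozen 4-channel teacher latent and leaves the remaining twelve channels unconstrained. The function name `csd_alignment_loss`, the list-of-channels data layout, and the choice of MSE are all assumptions made for this example, not the authors' implementation.

```python
def csd_alignment_loss(student_latent, teacher_latent, num_prior_channels=4):
    """Hypothetical alignment term: MSE between the first `num_prior_channels`
    channels of the student latent and the teacher latent.

    Each latent is a list of channels; each channel is a flat list of floats.
    Channels beyond `num_prior_channels` are ignored, so they stay free to
    encode extra detail.
    """
    if len(teacher_latent) != num_prior_channels:
        raise ValueError("teacher latent must have num_prior_channels channels")
    if len(student_latent) < num_prior_channels:
        raise ValueError("student latent has too few channels")

    total, count = 0.0, 0
    for s_channel, t_channel in zip(student_latent[:num_prior_channels], teacher_latent):
        if len(s_channel) != len(t_channel):
            raise ValueError("channel sizes must match")
        for s, t in zip(s_channel, t_channel):
            total += (s - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("latents must not be empty")
    return total / count


# Toy latents with two spatial positions per channel (assumed values).
teacher = [[0.0, 1.0] for _ in range(4)]
extra = [[5.0, -3.0] for _ in range(12)]        # the 12 additional channels
student_aligned = [list(ch) for ch in teacher] + extra
student_shifted = [[1.0, 1.0]] + [list(ch) for ch in teacher[1:]] + extra

loss_aligned = csd_alignment_loss(student_aligned, teacher)
loss_shifted = csd_alignment_loss(student_shifted, teacher)
```

In this toy setup the extra twelve channels have no effect on the term, which mirrors the stated goal of keeping the first four channels close to the pre-trained prior while the additional capacity is used elsewhere. A real training setup would use tensors and combine such a term with reconstruction losses.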
Source: arXiv: 2602.04406