CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping
1️⃣ One-sentence summary
This paper proposes CA-IDD, a novel face-swapping method that, for the first time, combines a diffusion model with a multi-scale cross-attention mechanism to accurately transfer the source face's identity features onto a target face while preserving the target's expression, pose, and background, surpassing traditional GAN-based methods in identity consistency and image realism.
Face swapping aims to generate realistic facial images by transferring the identity of a source face onto a target face while preserving the target's pose, expression, and context. However, existing methods, especially GAN-based ones, often struggle to balance identity preservation and visual realism due to limited controllability and mode collapse. In this paper, we introduce CA-IDD (Cross-Attention Guided Identity-Conditional Diffusion), the first diffusion-based face-swapping approach that integrates multi-modal guidance comprising gaze, identity, and facial parsing through multi-scale cross-attention. Precomputed identity embeddings are incorporated into the denoising process via hierarchical attention layers, yielding accurate and consistent identity transfer. To improve semantic coherence and visual quality, we employ expert-guided supervision through facial parsing and gaze-consistency modules. Unlike GAN-based or implicit-fusion methods, our diffusion framework provides stable training, robust generalization, and spatially adaptive identity alignment, allowing fine-grained regional control across pose and expression variations. CA-IDD achieves an FID of 11.73, exceeding established baselines such as FaceShifter and MegaFS. Qualitative results also reveal improved identity retention across diverse poses, establishing CA-IDD as a strong foundation for future diffusion-based face editing.
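The core mechanism described above, denoiser features attending to precomputed identity embeddings via cross-attention, can be sketched roughly as follows. This is a minimal single-head numpy illustration, not the authors' implementation; the function name `identity_cross_attention`, the toy shapes, and the residual-injection form are assumptions for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def identity_cross_attention(hidden, id_embed, Wq, Wk, Wv):
    """Cross-attention: spatial denoiser features attend to identity tokens.

    hidden:   (N, d) spatial features at one U-Net scale (N = H*W patches)
    id_embed: (M, d) precomputed identity embedding tokens (hypothetical
              output of a face-recognition encoder)
    """
    Q = hidden @ Wq                 # queries from the target-image features
    K = id_embed @ Wk               # keys from the source-identity embedding
    V = id_embed @ Wv               # values from the source-identity embedding
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))  # (N, M) attention weights
    return hidden + attn @ V        # residual injection of identity information

# Toy shapes: 16 spatial tokens, 4 identity tokens, feature width 8
rng = np.random.default_rng(0)
h = rng.standard_normal((16, 8))
ident = rng.standard_normal((4, 8))
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = identity_cross_attention(h, ident, Wq, Wk, Wv)
print(out.shape)
```

In a multi-scale setup such as the one the abstract describes, a layer like this would be inserted at several U-Net resolutions, so identity information can influence both coarse structure and fine detail.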
Source: arXiv: 2604.24493