arXiv submission date: 2026-04-27
📄 Abstract - CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping

Face swapping aims to generate realistic facial images by transferring the identity of a source face onto a target face while preserving pose, expression, and context. However, existing methods, especially GAN-based ones, often struggle to balance identity preservation and visual realism due to limited controllability and mode collapse. In this paper, we introduce CA-IDD (Cross-Attention Guided Identity-Conditional Diffusion), the first diffusion-based face swapping approach that integrates multi-modal guidance comprising gaze, identity, and facial parsing through multi-scale cross-attention. Precomputed identity embeddings are incorporated into the denoising process via hierarchical attention layers, resulting in accurate and consistent identity transfer. To improve semantic coherence and visual quality, we use expert-guided supervision with facial parsing and gaze-consistency modules. Unlike GAN-based or implicit-fusion methods, our diffusion framework provides stable training, robust generalization, and spatially adaptive identity alignment, allowing fine-grained regional control across pose and expression variations. CA-IDD achieves an FID of 11.73, exceeding established baselines such as FaceShifter and MegaFS. Qualitative results also reveal improved identity retention across diverse poses, establishing CA-IDD as a strong foundation for future diffusion-based face editing.
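The core mechanism described above is cross-attention in which spatial features of the denoising network attend to precomputed identity embeddings. The paper does not release code; the sketch below is an illustrative, simplified single-head version in plain NumPy, with hypothetical names (`identity_cross_attention`, the projection matrices `Wq`/`Wk`/`Wv`) and shapes chosen for the example. In the actual method this would be applied at multiple U-Net scales ("multi-scale cross-attention"), with learned projections and the identity tokens coming from a face-recognition encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def identity_cross_attention(feat, id_emb, Wq, Wk, Wv):
    """Illustrative identity-conditioned cross-attention (assumed form).

    feat:   (N, d)    flattened spatial features at one denoiser scale
    id_emb: (M, d_id) precomputed identity embedding tokens
    Queries come from the image features; keys/values from the identity
    tokens, so each spatial location pulls in identity information.
    """
    q = feat @ Wq                                    # (N, d_a)
    k = id_emb @ Wk                                  # (M, d_a)
    v = id_emb @ Wv                                  # (M, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (N, M) attention map
    return feat + attn @ v                           # residual identity injection

rng = np.random.default_rng(0)
d, d_id, d_a = 8, 16, 8
feat = rng.standard_normal((64, d))      # e.g. an 8x8 feature map, flattened
id_emb = rng.standard_normal((4, d_id))  # a few identity tokens (hypothetical)
Wq = rng.standard_normal((d, d_a))
Wk = rng.standard_normal((d_id, d_a))
Wv = rng.standard_normal((d_id, d))
out = identity_cross_attention(feat, id_emb, Wq, Wk, Wv)
print(out.shape)  # (64, 8): same shape as the input features
```

Because the attention map is spatial (one row per location), this construction naturally supports the "spatially adaptive identity alignment" the abstract claims: different face regions can weight the identity tokens differently.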

Top-level tags: computer vision, aigc
Detailed tags: face swapping, diffusion model, identity preservation, cross-attention, facial generation

CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping


1️⃣ One-sentence summary

This paper proposes a new face-swapping method called CA-IDD, the first to combine a diffusion model with a multi-scale cross-attention mechanism to accurately transfer the source face's identity onto the target face while preserving the target's expression, pose, and background, surpassing traditional GAN-based methods in identity consistency and image realism.

Source: arXiv 2604.24493