Retrieval-Augmented Gaussian Avatars: Improving Expression Generalization
1️⃣ One-sentence summary
This paper proposes RAF, a simple training-time augmentation that retrieves expressions from an external expression bank and substitutes them for a subset of the training expressions. This lets template-free 3D head avatar models learn a wider range of expression variation, significantly improving the quality and generalization of their expression synthesis without extra paired data or changes to the model architecture.
Template-free animatable head avatars can achieve high visual fidelity by learning expression-dependent facial deformation directly from a subject's capture, avoiding parametric face templates and hand-designed blendshape spaces. However, since learned deformation is supervised only by the expressions observed for a single identity, these models suffer from limited expression coverage and often struggle when driven by motions that deviate from the training distribution. We introduce RAF (Retrieval-Augmented Faces), a simple training-time augmentation designed for template-free head avatars that learn deformation from data. RAF constructs a large unlabeled expression bank and, during training, replaces a subset of the subject's expression features with nearest-neighbor expressions retrieved from this bank while still reconstructing the subject's original frames. This exposes the deformation field to a broader range of expression conditions, encouraging stronger identity-expression decoupling and improving robustness to expression distribution shift without requiring paired cross-identity data, additional annotations, or architectural changes. We further analyze how retrieval augmentation increases expression diversity and validate retrieval quality with a user study showing that retrieved neighbors are perceptually closer in expression and pose. Experiments on the NeRSemble benchmark demonstrate that RAF consistently improves expression fidelity over the baseline, in both self-driving and cross-driving scenarios.
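The core augmentation described above — retrieving nearest-neighbor expressions from an unlabeled bank and swapping them in for a random subset of the subject's expression features while still reconstructing the original frames — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the feature representation, Euclidean distance metric, replacement ratio, and all function names here are assumptions.

```python
import math
import random

def nearest_neighbor(feat, bank):
    """Index of the bank entry closest to `feat` in Euclidean distance.
    (Illustrative; the paper's actual retrieval metric may differ.)"""
    dists = [math.dist(feat, b) for b in bank]
    return dists.index(min(dists))

def retrieval_augment(expr_feats, bank, replace_prob=0.3, rng=None):
    """RAF-style training-time augmentation (hedged sketch): with
    probability `replace_prob`, replace a frame's expression feature
    with its nearest neighbor retrieved from the external expression
    bank. The reconstruction target stays the subject's original frame,
    so the deformation field sees a broader range of expression inputs.
    Returns the (possibly replaced) features and a per-frame flag."""
    rng = rng or random.Random()
    out, replaced = [], []
    for feat in expr_feats:
        if rng.random() < replace_prob:
            out.append(bank[nearest_neighbor(feat, bank)])
            replaced.append(True)
        else:
            out.append(feat)
            replaced.append(False)
    return out, replaced
```

Because replacement happens only on the conditioning side while supervision remains the subject's own frames, no paired cross-identity data or extra annotation is needed; the augmentation is a drop-in change to the training loop.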
Source: arXiv:2603.08645