EverAnimate: Minute-Scale Human Animation via Latent Flow Restoration
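1️⃣ One-Sentence Summary
This paper proposes a lightweight post-training method that maintains a persistent context memory to correct the drift in visual quality and character identity that accumulates during long-sequence generation, efficiently producing human animations up to 90 seconds long that stay sharp, smooth, and consistent in both character and scene.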
We propose EverAnimate, an efficient post-training method for long-horizon animated video generation that preserves visual quality and character identity. Long-form animation remains challenging because highly dynamic human motion must be synthesized against relatively static environments, making chunk-based generation prone to accumulated drift: (i) low-level quality drift, such as progressive degradation of static backgrounds, and (ii) high-level semantic drift, such as inconsistent character identity and view-dependent attributes. To address this issue, EverAnimate restores drifted flow trajectories by anchoring generation to a persistent latent context memory, consisting of two complementary mechanisms. (i) Persistent Latent Propagation maintains a context memory across chunks to propagate identity and motion in latent space while mitigating temporal forgetting. (ii) Restorative Flow Matching introduces an implicit restoration objective during sampling through velocity adjustment, improving within-chunk fidelity. With only lightweight LoRA tuning, EverAnimate outperforms state-of-the-art long-animation methods in both short- and long-horizon settings: at 10 seconds, it improves PSNR/SSIM by 8%/7% and reduces LPIPS/FID by 22%/11%; at 90 seconds, the gains increase to 15%/15% and 32%/27%, respectively.
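The drift argument in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the scalar latent, the per-chunk `bias`, the restoration weight `lam`, and the specific velocity-adjustment formula are all assumptions chosen only to show why anchoring each chunk to a persistent memory might keep accumulated error bounded.

```python
# Toy illustration only: a scalar stands in for a video latent, and the
# adjustment rule below is an assumed form, not the method from the paper.

def euler_sample(x_start, velocity, steps=20):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = x_start
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity(x, i * dt)
    return x


def generate_chunks(num_chunks, anchor, bias, lam):
    """Generate chunks sequentially; each one is conditioned on the previous.

    anchor: persistent memory (hypothetical), fixed for the whole video.
    bias:   per-chunk model error that makes plain chunking drift.
    lam:    restoration weight in [0, 1]; 0 disables the adjustment.
    """
    context = anchor
    outputs = []
    for _ in range(num_chunks):
        target = context + bias  # imperfect model: endpoint is slightly off

        def velocity(x, t, target=target):
            remaining = 1.0 - t
            v = (target - x) / remaining   # straight-line flow to target
            x_end = x + remaining * v      # endpoint implied by v
            # Assumed restoration term: nudge the implied endpoint to anchor.
            return v + lam * (anchor - x_end) / remaining

        context = euler_sample(0.0, velocity)
        outputs.append(context)
    return outputs


plain = generate_chunks(num_chunks=9, anchor=1.0, bias=0.1, lam=0.0)
restored = generate_chunks(num_chunks=9, anchor=1.0, bias=0.1, lam=0.5)
plain_drift = abs(plain[-1] - 1.0)
restored_drift = abs(restored[-1] - 1.0)
```

In this toy model the unanchored drift grows linearly with the number of chunks (roughly `num_chunks * bias`), while the anchored variant stays below `bias * (1 - lam) / lam` no matter how many chunks are generated. The real method operates on high-dimensional video latents with a learned velocity field, so the sketch conveys only the qualitative behavior.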
Source: arXiv: 2605.15042