Anchoring and Rescaling Attention for Semantically Coherent Inbetweening

📄 Abstract - Anchoring and Rescaling Attention for Semantically Coherent Inbetweening

Generative inbetweening (GI) seeks to synthesize realistic intermediate frames between the first and last keyframes, going beyond mere interpolation. As sequences become sparser and motions larger, previous GI models tend to produce inconsistent frames with unstable pacing and semantic misalignment. Because GI fixes the endpoints but admits numerous plausible paths between them, the task requires additional guidance from the keyframes and text to specify the intended path. We therefore inject semantic and temporal guidance from the keyframes and text into each intermediate frame through Keyframe-anchored Attention Bias. We also enforce frame consistency more strongly with Rescaled Temporal RoPE, which allows self-attention to attend to keyframes more faithfully. TGI-Bench, the first benchmark specifically designed for text-conditioned GI evaluation, enables challenge-targeted evaluation and analysis of GI models. Without additional training, our method achieves state-of-the-art frame consistency, semantic fidelity, and pace stability for both short and long sequences across diverse challenges.
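The abstract names two mechanisms without giving their formulas, so the following is only a generic sketch of the underlying ideas, not the paper's formulation. It assumes an additive bias on attention logits that favors the first and last frames, and temporal position indices rescaled to a fixed span before a rotary embedding would be applied. The function names `keyframe_bias`, `biased_attention`, and `rescaled_positions`, and the parameters `strength` and `span`, are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def keyframe_bias(num_frames, strength):
    """Hypothetical additive bias: `strength` on the first and last frame, 0 elsewhere."""
    bias = [0.0] * num_frames
    bias[0] = strength
    bias[-1] = strength
    return bias


def biased_attention(scores, bias):
    """Attention weights of one query over all frames, with the bias added to the logits."""
    return softmax([s + b for s, b in zip(scores, bias)])


def rescaled_positions(num_frames, span):
    """Assumed rescaling: spread frame indices evenly over [0, span] for any frame count."""
    if num_frames == 1:
        return [0.0]
    step = span / (num_frames - 1)
    return [i * step for i in range(num_frames)]


# One intermediate-frame query with uniform raw scores over 5 frames.
scores = [0.0] * 5
plain = softmax(scores)
biased = biased_attention(scores, keyframe_bias(5, strength=1.0))

# Short and long sequences cover the same temporal range after rescaling.
short_pos = rescaled_positions(5, span=16.0)
long_pos = rescaled_positions(9, span=16.0)
```

In this toy setup the bias moves attention mass from the middle frames onto the two keyframes, and the rescaled positions keep the last keyframe at the same temporal coordinate whether the sequence has 5 or 9 frames. How the actual method derives its bias from keyframes and text, and how it rescales RoPE, is not stated in the abstract.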

1️⃣ One-Sentence Summary

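This paper proposes a new method that anchors attention to the keyframes and rescales the temporal encoding so that, given the first and last keyframes and a text description, it generates intermediate animation frames with more coherent semantics and more stable pacing, and it achieves the best results on a purpose-built evaluation benchmark.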
