AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics

📄 Abstract - AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics

Video generation models internalize physical realism as their prior. Anime deliberately violates physics (smears, impact frames, chibi shifts), and its thousands of coexisting artistic conventions yield no single "physics of anime" a model can absorb. Physics-biased models therefore flatten the artistry that defines the medium or collapse under its stylistic variance. We present AniMatrix, a video generation model that targets artistic rather than physical correctness through a dual-channel conditioning mechanism and a three-step transition: redefine correctness, override the physics prior, and distinguish art from failure. First, a Production Knowledge System encodes anime as a structured taxonomy of controllable production variables (Style, Motion, Camera, VFX), and AniCaption infers these variables from pixels as directorial directives. A trainable tag encoder preserves the field-value structure of this taxonomy while a frozen T5 encoder handles free-form narrative; dual-path injection (cross-attention for fine-grained control, AdaLN modulation for global enforcement) ensures categorical directives are never diluted by open-ended text. Second, a style-motion-deformation curriculum transitions the model from near-physical motion to full anime expressiveness. Third, deformation-aware preference optimization with a domain-specific reward model separates intentional artistry from pathological collapse. On an anime-specific human evaluation with five production dimensions scored by professional animators, AniMatrix ranks first on four of five, with the largest gains over Seedance-Pro 1.0 on Prompt Understanding (+0.70, +22.4 percent) and Artistic Motion (+0.55, +16.9 percent). We are preparing accompanying resources for public release to support reproducibility and follow-up research.
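To make the tag channel described in the abstract more concrete, here is a minimal pure-Python sketch under stated assumptions. It shows production variables kept as explicit field-value pairs, and an AdaLN-style modulation step written in the commonly used form LN(x) * (1 + scale) + shift. The class `ProductionTags`, the helper names, and every tag value are hypothetical illustrations, not the paper's vocabulary or implementation. In a real model the scale and shift would come from a learned projection of the tag embedding; here they are passed in directly.

```python
from dataclasses import dataclass, fields
import math


# Hypothetical container: the four field names mirror the variable groups
# named in the abstract (Style, Motion, Camera, VFX); values are made up.
@dataclass(frozen=True)
class ProductionTags:
    style: str
    motion: str
    camera: str
    vfx: str


def to_field_value_tokens(tags):
    """Serialize tags as 'field=value' tokens so the field structure is kept."""
    return [f"{f.name}={getattr(tags, f.name)}" for f in fields(tags)]


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def adaln_modulate(x, scale, shift):
    """AdaLN-style modulation: normalize, then apply a per-channel
    scale and shift that are assumed to be derived from the tags."""
    normed = layer_norm(x)
    return [n * (1.0 + s) + b for n, s, b in zip(normed, scale, shift)]


tags = ProductionTags(
    style="cel_shaded", motion="smear", camera="whip_pan", vfx="impact_frame"
)
tokens = to_field_value_tokens(tags)

hidden = [0.5, -1.0, 2.0, 0.25]  # stand-in for one token's hidden state
out = adaln_modulate(hidden, scale=[0.0] * 4, shift=[0.0] * 4)
```

With zero scale and shift the modulation reduces to plain layer normalization, so any nonzero conditioning acts on every channel of the hidden state. That is one reading of why the abstract pairs AdaLN with "global enforcement", while cross-attention over the field-value tokens would carry the fine-grained control.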


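1️⃣ One-sentence summary

Existing video generation models rely on physical common sense and therefore cannot render the deliberately physics-violating techniques of anime. AniMatrix addresses this with a dual-channel control mechanism and three key steps (redefining "correct" with a structured knowledge system, overriding the physics prior with curriculum learning, and separating art from error with preference optimization), so that the model learns to generate video that follows anime's artistic rules rather than physical ones; in evaluations by professional animators it significantly outperforms existing methods.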
