Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals
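1️⃣ One-Sentence Summary
This paper proposes a new framework called "Goal Force" that lets users define goals through explicit force vectors and intermediate dynamics. It trains a video generation model to understand and simulate physical interactions, so that the model can carry out precise, physics-based goal planning in complex real-world scenarios.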
Recent advancements in video generation have enabled the development of "world models" capable of simulating potential futures for robotics and planning. However, specifying precise goals for these models remains a challenge; text instructions are often too abstract to capture physical nuances, while target images are frequently infeasible to specify for dynamic tasks. To address this, we introduce Goal Force, a novel framework that allows users to define goals via explicit force vectors and intermediate dynamics, mirroring how humans conceptualize physical tasks. We train a video generation model on a curated dataset of synthetic causal primitives (such as elastic collisions and falling dominos), teaching it to propagate forces through time and space. Despite being trained on simple physics data, our model exhibits remarkable zero-shot generalization to complex, real-world scenarios, including tool manipulation and multi-object causal chains. Our results suggest that by grounding video generation in fundamental physical interactions, models can emerge as implicit neural physics simulators, enabling precise, physics-aware planning without reliance on external engines. We release all datasets, code, model weights, and interactive video demos at our project page.
Source: arXiv: 2601.05848