arXiv submission date: 2025-12-31
📄 Abstract - Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow

Generative video modeling has emerged as a compelling tool to zero-shot reason about plausible physical interactions for open-world manipulation. Yet, it remains a challenge to translate such human-led motions into the low-level actions demanded by robotic systems. We observe that, given an initial image and task instruction, these models excel at synthesizing sensible object motions. Thus, we introduce Dream2Flow, a framework that bridges video generation and robotic control through 3D object flow as an intermediate representation. Our method reconstructs 3D object motions from generated videos and formulates manipulation as object trajectory tracking. By separating the state changes from the actuators that realize those changes, Dream2Flow overcomes the embodiment gap and enables zero-shot guidance from pre-trained video models to manipulate objects of diverse categories, including rigid, articulated, deformable, and granular objects. Through trajectory optimization or reinforcement learning, Dream2Flow converts reconstructed 3D object flow into executable low-level commands without task-specific demonstrations. Simulation and real-world experiments highlight 3D object flow as a general and scalable interface for adapting video generation models to open-world robotic manipulation. Videos and visualizations are available at this https URL.
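To make the pipeline described in the abstract concrete, below is a minimal, self-contained sketch (not the authors' code) of formulating manipulation as 3D object trajectory tracking: a reconstructed 3D object flow serves as the target trajectory, and a sampling-based trajectory optimizer searches for low-level commands whose rollout makes the object points follow that flow. The toy dynamics (`simulate_step`), the translation-only action space, and all function names here are illustrative assumptions, not the paper's implementation.

```python
"""
Sketch of trajectory tracking against a reconstructed 3D object flow,
using a cross-entropy-method (CEM) optimizer over an action sequence.
All dynamics, action spaces, and names are illustrative assumptions.
"""
import numpy as np

rng = np.random.default_rng(0)

def simulate_step(points, action):
    """Toy placeholder dynamics: the action is a rigid 3-D translation applied
    to every object point. A real system would roll out a physics simulator or
    a learned dynamics model conditioned on robot commands."""
    return points + action

def rollout_cost(init_points, actions, target_flow):
    """Sum over timesteps of the mean squared distance between the rolled-out
    object points and the target 3D object flow (shape (T, N, 3))."""
    points, cost = init_points.copy(), 0.0
    for t in range(len(actions)):
        points = simulate_step(points, actions[t])
        cost += float(np.mean(np.sum((points - target_flow[t]) ** 2, axis=-1)))
    return cost

def optimize_actions(init_points, target_flow, samples=64, elites=8, iters=30):
    """CEM search over an action sequence (one translation per step, an assumed
    action space) that minimizes the tracking cost."""
    T = target_flow.shape[0]
    mean, std = np.zeros((T, 3)), 0.1 * np.ones((T, 3))
    for _ in range(iters):
        candidates = mean + std * rng.normal(size=(samples, T, 3))
        costs = np.array([rollout_cost(init_points, a, target_flow) for a in candidates])
        elite = candidates[np.argsort(costs)[:elites]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

if __name__ == "__main__":
    init = rng.normal(size=(50, 3))                                   # N = 50 object points
    drift = np.cumsum(np.tile([[0.02, 0.0, 0.01]], (20, 1)), axis=0)  # (T, 3) cumulative motion
    target = init[None, :, :] + drift[:, None, :]                     # (T, N, 3) target object flow
    actions = optimize_actions(init, target)
    print("final tracking cost:", rollout_cost(init, actions, target))
```

Because the optimizer only queries rollout costs, it needs no gradients from the dynamics, which is one reason a tracking objective over object points can wrap around simulators or learned models interchangeably; the paper's actual optimization and reinforcement-learning setups are not reproduced here.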

Top-level tags: robotics, video generation, multi-modal
Detailed tags: 3D object flow, open-world manipulation, trajectory optimization, zero-shot learning, video-to-action

Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow


1️⃣ One-Sentence Summary

This paper proposes Dream2Flow, a framework that extracts 3D object motion trajectories from AI-generated videos and converts them into executable robot commands, enabling robots to perform zero-shot manipulation of diverse object categories without task-specific training.

Source: arXiv:2512.24766