BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
1️⃣ One-Sentence Summary
This paper proposes a new framework, BulletTime, that precisely controls a generated video's scene dynamics and camera viewpoint separately, as if time and camera angle were independent dials, addressing existing video generation models' difficulty with fine-grained spatio-temporal control.
Emerging video diffusion models achieve high visual fidelity but fundamentally couple scene dynamics with camera motion, limiting precise spatial and temporal control. We introduce a 4D-controllable video diffusion framework that explicitly decouples scene dynamics from camera pose, enabling fine-grained manipulation of both. Our framework takes continuous world-time sequences and camera trajectories as conditioning inputs, injecting them into the video diffusion model through a 4D positional encoding in the attention layers and adaptive normalization layers for feature modulation. To train this model, we curate a unique dataset in which temporal and camera variations are independently parameterized; this dataset will be made public. Experiments show that our model achieves robust real-world 4D control across diverse timing patterns and camera trajectories, while preserving high generation quality and outperforming prior work in controllability. See our website for video results: this https URL
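To make the conditioning path concrete, here is a minimal PyTorch sketch of one attention block that injects per-frame world-time and camera pose through AdaLN-style feature modulation while adding a 4D positional encoding inside the attention computation, matching the split the abstract describes. All names and conventions here (`FourDCondBlock`, `sinusoidal_embed`, the flattened 3x4 extrinsics as a 12-D pose vector, the token shapes) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of the conditioning path: 4D positional encoding in
# attention plus AdaLN-style modulation from world-time and camera pose.
# Assumes a DiT-style backbone with F frames of N spatial tokens each.
import math
import torch
import torch.nn as nn

def sinusoidal_embed(x: torch.Tensor, dim: int) -> torch.Tensor:
    """Standard sinusoidal features for a scalar signal x; dim must be even."""
    half = dim // 2
    freqs = torch.exp(
        -math.log(10000.0)
        * torch.arange(half, dtype=x.dtype, device=x.device)
        / max(half - 1, 1)
    )
    ang = x[..., None] * freqs
    return torch.cat([ang.sin(), ang.cos()], dim=-1)  # (..., dim)

class FourDCondBlock(nn.Module):
    """One attention block conditioned on per-frame time and camera pose."""
    def __init__(self, d_model: int, n_heads: int, pose_dim: int = 12):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model, elementwise_affine=False)
        self.pose_proj = nn.Linear(pose_dim, d_model)
        # (time embed ++ pose embed) -> per-frame AdaLN scale and shift.
        self.to_mod = nn.Linear(2 * d_model, 2 * d_model)
        self.d_model = d_model

    def forward(self, tokens, world_time, cam_pose, pos_4d):
        # tokens:     (B, F*N, D)  flattened spatio-temporal latents
        # world_time: (B, F)       continuous world-time per frame
        # cam_pose:   (B, F, 12)   e.g. flattened 3x4 extrinsics per frame
        # pos_4d:     (B, F*N, D)  per-token encoding of (t, x, y, z)
        B, F = world_time.shape
        N = tokens.shape[1] // F
        t_emb = sinusoidal_embed(world_time, self.d_model)       # (B, F, D)
        p_emb = self.pose_proj(cam_pose)                         # (B, F, D)
        scale, shift = self.to_mod(
            torch.cat([t_emb, p_emb], dim=-1)
        ).chunk(2, dim=-1)
        # Broadcast per-frame modulation to every spatial token of that frame.
        scale = scale.repeat_interleave(N, dim=1)
        shift = shift.repeat_interleave(N, dim=1)
        h = self.norm(tokens) * (1 + scale) + shift              # AdaLN
        q = h + pos_4d                                           # 4D pos. enc.
        out, _ = self.attn(q, q, h, need_weights=False)
        return tokens + out
```

A full model would interleave such blocks with MLPs and the usual diffusion noise-timestep conditioning; the sketch only illustrates the division of labor stated above: positions enter the attention layer, while time and pose drive the adaptive normalization statistics.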