arXiv submission date: 2025-12-31
📄 Abstract - SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

We present SpaceTimePilot, a video diffusion model that disentangles space and time for controllable generative rendering. Given a monocular video, SpaceTimePilot can independently alter the camera viewpoint and the motion sequence within the generative process, re-rendering the scene for continuous and arbitrary exploration across space and time. To achieve this, we introduce an effective animation time-embedding mechanism in the diffusion process, allowing explicit control of the output video's motion sequence with respect to that of the source video. As no datasets provide paired videos of the same dynamic scene with continuous temporal variations, we propose a simple yet effective temporal-warping training scheme that repurposes existing multi-view datasets to mimic temporal differences. This strategy effectively supervises the model to learn temporal control and achieve robust space-time disentanglement. To further enhance the precision of dual control, we introduce two additional components: an improved camera-conditioning mechanism that allows altering the camera from the first frame, and CamxTime, the first synthetic space-and-time full-coverage rendering dataset that provides fully free space-time video trajectories within a scene. Joint training on the temporal-warping scheme and the CamxTime dataset yields more precise temporal control. We evaluate SpaceTimePilot on both real-world and synthetic data, demonstrating clear space-time disentanglement and strong results compared to prior work. Project page: this https URL Code: this https URL
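The temporal-warping training scheme mentioned in the abstract can be pictured with a small sketch. The abstract does not specify which warps are used or how they are encoded, so everything below is an illustrative assumption rather than the authors' implementation: the warp modes (`identity`, `reverse`, `slow`, `freeze`), the helper names `sample_time_warp` and `make_training_pair`, and the normalization of time labels to [0, 1] are all hypothetical. The idea shown is only that two time-synchronized views of a scene can be turned into a training pair whose target plays back a different motion sequence than the source.

```python
import random


def sample_time_warp(num_frames, mode, rng=None):
    """Return, for each output frame, the index of the source-time frame it shows.

    The warp modes are hypothetical examples, not taken from the paper.
    """
    rng = rng or random.Random()
    last = num_frames - 1
    if mode == "identity":
        return list(range(num_frames))
    if mode == "reverse":
        return list(range(last, -1, -1))
    if mode == "slow":
        # Half-speed playback: each source timestep is held for two output frames.
        start = rng.randint(0, last // 2)
        return [min(start + i // 2, last) for i in range(num_frames)]
    if mode == "freeze":
        # A single instant is held for the whole clip.
        return [rng.randint(0, last)] * num_frames
    raise ValueError(f"unknown warp mode: {mode}")


def make_training_pair(source_view, target_view, warp):
    """Build (conditioning video, supervision video, per-frame time labels).

    source_view and target_view are assumed to be time-synchronized frame
    lists of the same scene from two cameras. The target is re-indexed by
    the warp, and the labels (normalized to [0, 1]) stand in for the
    animation-time signal the model would be conditioned on.
    """
    if len(source_view) != len(target_view):
        raise ValueError("views must be time-synchronized and equally long")
    last = max(len(source_view) - 1, 1)
    target = [target_view[t] for t in warp]
    time_labels = [t / last for t in warp]
    return source_view, target, time_labels


# Usage with placeholder frame names instead of real images.
src = [f"camA_t{t}" for t in range(4)]
tgt = [f"camB_t{t}" for t in range(4)]
_, target, labels = make_training_pair(src, tgt, sample_time_warp(4, "reverse"))
# target == ["camB_t3", "camB_t2", "camB_t1", "camB_t0"]
```

In this sketch the source clip always plays forward from one camera, while the target shows the other camera with re-timed motion, so a model trained on such pairs would have to rely on the time labels rather than on frame order.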

Top-level tags: computer vision, video generation, AIGC
Detailed tags: video diffusion, spatiotemporal disentanglement, camera control, temporal editing, generative rendering

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time


1️⃣ One-sentence summary

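This paper proposes SpaceTimePilot, a video generation model that controls space (the camera viewpoint) and time (object motion) in a video independently, so users can freely change both the viewing angle and the motion sequence and flexibly re-render dynamic scenes.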

Source: arXiv: 2512.25075