📄 Abstract - ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding

We present ReDirector, a novel camera-controlled video retake generation method for dynamically captured variable-length videos. In particular, we rectify a common misuse of RoPE in previous works by aligning the spatiotemporal positions of the input video and the target retake. Moreover, we introduce Rotary Camera Encoding (RoCE), a camera-conditioned RoPE phase shift that captures and integrates multi-view relationships within and across the input and target videos. By integrating camera conditions into RoPE, our method generalizes to out-of-distribution camera trajectories and video lengths, yielding improved dynamic object localization and static background preservation. Extensive experiments further demonstrate significant improvements in camera controllability, geometric consistency, and video quality across various trajectories and lengths.

Top-level tags: video generation, computer vision, multi-modal
Detailed tags: camera control, video retakes, rotary position encoding, geometric consistency, novel view synthesis

ReDirector: A Camera-Controlled Video Retake Generation Method for Dynamically Captured, Variable-Length Videos / ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding


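1️⃣ One-Sentence Summary

This paper proposes ReDirector, a new method that introduces Rotary Camera Encoding (RoCE) and a geometry-aware attention mechanism to address the poor geometric consistency and weak generalization of existing methods when handling dynamic camera motion and variable-length input videos. The result is high-quality, geometrically consistent video retake generation of any length with precise camera control.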


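2️⃣ Key Innovations

1. Rotary Camera Encoding (RoCE)

2. RoPE-based spatiotemporal position alignment

3. Geometry-aware attention mechanism

4. Enhanced training strategy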
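
The first two items can be illustrated with a toy sketch of rotary position encoding. It is not the paper's implementation: it assumes that "alignment" means frame t of the input video and frame t of the retake share one position index, and it uses a hypothetical `toy_camera_phase` helper (a single yaw angle copied to every frequency band) as a stand-in for a camera-conditioned phase shift. The actual RoCE formula is not given in this summary.

```python
import cmath

def rope_frequencies(num_pairs, base=10000.0):
    """Standard RoPE schedule: one angular frequency per complex feature pair."""
    return [base ** (-i / num_pairs) for i in range(num_pairs)]

def toy_camera_phase(yaw_radians, num_pairs):
    """HYPOTHETICAL stand-in for a camera-conditioned phase (not the paper's RoCE)."""
    return [yaw_radians] * num_pairs

def apply_rope(pairs, position, freqs, phase=None):
    """Rotate each complex pair by position * freq, plus an optional extra phase."""
    if phase is None:
        phase = [0.0] * len(freqs)
    return [z * cmath.exp(1j * (position * f + p))
            for z, f, p in zip(pairs, freqs, phase)]

def score(q, k):
    """Attention logit: real part of the Hermitian inner product of q and k."""
    return sum((a * b.conjugate()).real for a, b in zip(q, k))

freqs = rope_frequencies(4)
q = [1 + 2j, 0.5 - 1j, -0.3 + 0.8j, 2 + 0j]    # toy query from the retake
k = [0.2 - 0.4j, 1 + 1j, 0.7 + 0.1j, -1 + 0.5j]  # toy key from the input video

# Aligned positions: retake frame 7 and input frame 7 share index 7,
# so the rotations cancel and the logit equals the unrotated one.
plain = score(q, k)
aligned = score(apply_rope(q, 7, freqs), apply_rope(k, 7, freqs))

# With an extra phase on each side, the logit depends only on the phase
# difference between the two views (here 0.3 - 0.1 radians).
shifted = score(apply_rope(q, 7, freqs, toy_camera_phase(0.3, 4)),
                apply_rope(k, 7, freqs, toy_camera_phase(0.1, 4)))
```

In this toy setting the attention logit depends only on relative position and relative phase. That property is one plausible reason a phase-shift formulation could extend to video lengths and camera trajectories not seen in training, although the summary itself does not spell out the mechanism.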


3️⃣ Main Results and Value

Result Highlights

Practical Value


4️⃣ Glossary
