arXiv submission date: 2026-01-08
📄 Abstract - Plenoptic Video Generation

Camera-controlled generative video re-rendering methods, such as ReCamMaster, have achieved remarkable progress. However, despite their success in the single-view setting, these methods often struggle to maintain consistency across multi-view scenarios. Ensuring spatio-temporal coherence in hallucinated regions remains challenging due to the inherent stochasticity of generative models. To address this, we introduce PlenopticDreamer, a framework that synchronizes generative hallucinations to maintain spatio-temporal memory. The core idea is to train a multi-in-single-out video-conditioned model in an autoregressive manner, aided by a camera-guided video retrieval strategy that adaptively selects salient videos from previous generations as conditional inputs. In addition, our training incorporates progressive context scaling to improve convergence, self-conditioning to enhance robustness against long-range visual degradation caused by error accumulation, and a long-video conditioning mechanism to support extended video generation. Extensive experiments on the Basic and Agibot benchmarks demonstrate that PlenopticDreamer achieves state-of-the-art video re-rendering, delivering superior view synchronization, high-fidelity visuals, accurate camera control, and diverse view transformations (e.g., third-person to third-person, and head-view to gripper-view in robotic manipulation). Project page: this https URL
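To make the retrieval idea concrete, below is a minimal sketch of one plausible reading of the camera-guided video retrieval step, assuming past clips are scored by camera-pose proximity to the target trajectory. All names (`GeneratedClip`, `pose_distance`, `retrieve_condition_clips`), the distance metric, and the top-k selection are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch (not from the paper): given the target camera
# trajectory for the next clip, rank previously generated clips by
# camera-pose similarity and return the top-k as conditional inputs.
from dataclasses import dataclass
import numpy as np

@dataclass
class GeneratedClip:
    frames: np.ndarray  # (T, H, W, 3) generated video frames
    poses: np.ndarray   # (T, 4, 4) camera-to-world extrinsics per frame

def pose_distance(pose_a: np.ndarray, pose_b: np.ndarray) -> float:
    """Distance between two camera poses: translation gap plus rotation angle."""
    t_gap = np.linalg.norm(pose_a[:3, 3] - pose_b[:3, 3])
    # Rotation angle recovered from the trace of the relative rotation.
    cos_theta = (np.trace(pose_a[:3, :3].T @ pose_b[:3, :3]) - 1.0) / 2.0
    angle = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return float(t_gap + angle)

def retrieve_condition_clips(memory: list[GeneratedClip],
                             target_poses: np.ndarray,
                             k: int = 2) -> list[GeneratedClip]:
    """Select the k past clips whose trajectories best cover the target views."""
    def clip_score(clip: GeneratedClip) -> float:
        # For each target pose, take the closest pose in the clip; average.
        return float(np.mean([
            min(pose_distance(tp, cp) for cp in clip.poses)
            for tp in target_poses
        ]))
    return sorted(memory, key=clip_score)[:k]
```

In an autoregressive loop, the retrieved clips would serve as the multi-in conditions for generating the next single-out clip, which is then appended to the memory for subsequent steps.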

Top tags: video generation, multi-modal, computer vision
Detailed tags: plenoptic video, multi-view consistency, video re-rendering, spatio-temporal coherence, autoregressive generation

Plenoptic Video Generation


1️⃣ One-Sentence Summary

This paper proposes a new framework, PlenopticDreamer, which maintains spatio-temporal consistency by synchronizing the "hallucinated" content produced during generation, addressing the frame inconsistency that existing methods exhibit in multi-view video generation and achieving high-quality, controllable video re-rendering with diverse viewpoints.

Source: arXiv 2601.05239