arXiv submission date: 2026-01-01
📄 Abstract - NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

In this paper, we propose NeoVerse, a versatile 4D world model that is capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications. We first identify a common limitation of scalability in current 4D world modeling methods, caused either by expensive and specialized multi-view 4D data or by cumbersome training pre-processing. In contrast, our NeoVerse is built upon a core philosophy that makes the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online monocular degradation pattern simulation, and other well-aligned techniques. These designs empower NeoVerse with versatility and generalization to various domains. Meanwhile, NeoVerse achieves state-of-the-art performance in standard reconstruction and generation benchmarks. Our project page is available at this https URL

Top-level tags: computer vision, multi-modal, video generation
Detailed tags: 4D reconstruction, world model, monocular video, novel view synthesis, video generation

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos


1️⃣ One-sentence summary

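This paper proposes NeoVerse, a new 4D world model that efficiently reconstructs dynamic 3D scenes and generates novel-view videos using only ordinary monocular videos readily available online. It removes earlier methods' dependence on expensive specialized data or cumbersome pre-processing, and it reaches state-of-the-art performance on multiple tasks.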

Source: arXiv: 2601.00393