arXiv submission date: 2026-03-17
📄 Abstract - DriveFix: Spatio-Temporally Coherent Driving Scene Restoration

Recent advancements in 4D scene reconstruction, particularly those leveraging diffusion priors, have shown promise for novel view synthesis in autonomous driving. However, these methods often process frames independently or in a view-by-view manner, leading to a critical lack of spatio-temporal synergy. This results in spatial misalignment across cameras and temporal drift in sequences. We propose DriveFix, a novel multi-view restoration framework that ensures spatio-temporal coherence for driving scenes. Our approach employs an interleaved diffusion transformer architecture with specialized blocks to explicitly model both temporal dependencies and cross-camera spatial consistency. By conditioning the generation on historical context and integrating geometry-aware training losses, DriveFix enforces that the restored views adhere to a unified 3D geometry. This enables the consistent propagation of high-fidelity textures and significantly reduces artifacts. Extensive evaluations on the Waymo, nuScenes, and PandaSet datasets demonstrate that DriveFix achieves state-of-the-art performance in both reconstruction and novel view synthesis, marking a substantial step toward robust 4D world modeling for real-world deployment.
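
The abstract names an interleaved design with separate blocks for temporal dependencies and cross-camera consistency, but gives no layout details. As a rough illustration only, the sketch below alternates two attention passes over a toy token set: one across frames for a fixed camera and token slot, and one across all cameras within a frame. Everything here is an assumption made for illustration and not the authors' implementation: the `(frame, camera, slot)` token keys, the projection-free attention, the residual connections, and the names `attend`, `block`, `temporal_block`, `cross_view_block`, and `interleaved_stack`.

```python
import math
from collections import defaultdict


def attend(vectors):
    """Projection-free self-attention over a list of equal-length vectors."""
    dim = len(vectors[0])
    out = []
    for q in vectors:
        # Scaled dot-product scores of this query against every vector.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[i] for w, v in zip(weights, vectors)) / total
                    for i in range(dim)])
    return out


def block(tokens, group_key):
    """Residual attention inside each group; tokens maps (frame, camera, slot) -> vector."""
    groups = defaultdict(list)
    for key in sorted(tokens):
        groups[group_key(key)].append(key)
    result = {}
    for keys in groups.values():
        mixed = attend([tokens[k] for k in keys])
        for k, m in zip(keys, mixed):
            result[k] = [a + b for a, b in zip(tokens[k], m)]
    return result


def temporal_block(tokens):
    # Hypothetical temporal block: same camera and slot, attending across frames.
    return block(tokens, lambda k: (k[1], k[2]))


def cross_view_block(tokens):
    # Hypothetical spatial block: same frame, attending across all cameras and slots.
    return block(tokens, lambda k: k[0])


def interleaved_stack(tokens, depth=2):
    # Alternate the two block types, which is what "interleaved" is taken to mean here.
    for _ in range(depth):
        tokens = cross_view_block(temporal_block(tokens))
    return tokens
```

In this toy setup, changing one camera's token leaves the temporal pass for other cameras untouched but does alter their cross-view output, which is the separation of roles the abstract describes.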

Top-level tags: computer vision, multi-modal systems
Detailed tags: 4D scene reconstruction, novel view synthesis, autonomous driving, spatio-temporal coherence, diffusion transformer

DriveFix: Spatio-Temporally Coherent Driving Scene Restoration


1️⃣ One-sentence summary

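This paper proposes DriveFix, a multi-view restoration method for autonomous-driving video whose interleaved diffusion transformer jointly models temporal dependencies and cross-camera spatial consistency, producing restored views that stay coherent across time and cameras and, according to the authors, achieving state-of-the-art reconstruction and novel view synthesis on Waymo, nuScenes, and PandaSet.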

Source: arXiv: 2603.16306