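DriveFix: Spatio-Temporally Coherent Driving Scene Restoration
1️⃣ One-sentence summary
This paper proposes DriveFix, a method that restores video of autonomous driving scenes with a network architecture that jointly models temporal and multi-camera spatial information, producing high-quality 3D scenes that are coherent in both time and space and significantly outperforming existing techniques.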
Recent advancements in 4D scene reconstruction, particularly those leveraging diffusion priors, have shown promise for novel view synthesis in autonomous driving. However, these methods often process frames independently or in a view-by-view manner, leading to a critical lack of spatio-temporal synergy. This results in spatial misalignment across cameras and temporal drift in sequences. We propose DriveFix, a novel multi-view restoration framework that ensures spatio-temporal coherence for driving scenes. Our approach employs an interleaved diffusion transformer architecture with specialized blocks to explicitly model both temporal dependencies and cross-camera spatial consistency. By conditioning the generation on historical context and integrating geometry-aware training losses, DriveFix enforces that the restored views adhere to a unified 3D geometry. This enables the consistent propagation of high-fidelity textures and significantly reduces artifacts. Extensive evaluations on the Waymo, nuScenes, and PandaSet datasets demonstrate that DriveFix achieves state-of-the-art performance in both reconstruction and novel view synthesis, marking a substantial step toward robust 4D world modeling for real-world deployment.
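To make the "interleaved" idea in the abstract concrete, the following is a minimal sketch of alternating temporal and cross-camera attention over a grid of per-frame, per-camera feature vectors. It is an illustration only, not the authors' architecture: the abstract does not specify the block design, so the function names (`attend`, `temporal_block`, `cross_camera_block`, `interleaved`), the identity query/key/value projections, and the omission of residuals, normalization, history conditioning, and the diffusion process itself are all assumptions.

```python
import math


def attend(tokens):
    """Single-head softmax self-attention over a list of vectors.

    Assumption: identity Q/K/V projections, so each output is a convex
    combination of the input vectors.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(w / z * k[j] for w, k in zip(weights, tokens)) for j in range(d)])
    return out


def temporal_block(x):
    """Attend across time for each fixed camera. x[t][v] is a feature vector."""
    T, V = len(x), len(x[0])
    y = [[None] * V for _ in range(T)]
    for v in range(V):
        mixed = attend([x[t][v] for t in range(T)])
        for t in range(T):
            y[t][v] = mixed[t]
    return y


def cross_camera_block(x):
    """Attend across cameras within each fixed time step."""
    return [attend(frame) for frame in x]


def interleaved(x, depth=1):
    """Alternate temporal and cross-camera blocks `depth` times (hypothetical layout)."""
    for _ in range(depth):
        x = temporal_block(x)
        x = cross_camera_block(x)
    return x


# Toy input: 3 time steps, 2 cameras, 2-dim features; only (t=0, cam=0) is non-zero.
grid = [[[0.0, 0.0] for _ in range(2)] for _ in range(3)]
grid[0][0] = [1.0, 1.0]
restored = interleaved(grid, depth=1)
```

In this toy setup, a single temporal block followed by a single cross-camera block is enough for information from camera 0 at the first time step to reach camera 1 at the last time step, which is the kind of joint propagation that frame-by-frame or view-by-view processing cannot provide.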
Source: arXiv: 2603.16306