arXiv submission date: 2026-01-27
📄 Abstract - VGGT-SLAM 2.0: Real-time Dense Feed-forward Scene Reconstruction

We present VGGT-SLAM 2.0, a real-time RGB feed-forward SLAM system that substantially improves upon VGGT-SLAM for incrementally aligning submaps created from VGGT. First, we remove the high-dimensional 15-degree-of-freedom drift and planar degeneracy of VGGT-SLAM with a new factor-graph design, while still addressing the reconstruction ambiguity of VGGT under unknown camera intrinsics. Second, by studying the attention layers of VGGT, we show that one of the layers is well suited to verify image-retrieval matches for free, without additional training, which both rejects false-positive matches and allows more loop closures to be completed. Finally, we conduct a suite of experiments: we show VGGT-SLAM 2.0 can easily be adapted for open-set object detection, and we demonstrate real-time performance while running online onboard a ground robot using a Jetson Thor. We test in environments ranging from cluttered indoor apartments and office scenes to a 4,200-square-foot barn, and we demonstrate that VGGT-SLAM 2.0 achieves the highest accuracy on the TUM dataset, with about 23 percent less pose error than VGGT-SLAM. Code will be released upon publication.
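The retrieval-verification idea described above can be sketched in a few lines. The paper's actual layer choice and scoring rule are not given in the abstract, so the pooling scheme, function names, and the 0.8 threshold below are illustrative assumptions: pool per-token features from a chosen attention layer into a normalized global descriptor, then accept a loop-closure candidate only if the cosine similarity between the two frames' descriptors clears a threshold.

```python
import numpy as np

def attention_descriptor(attn_tokens: np.ndarray) -> np.ndarray:
    """Pool per-token features (n_tokens, dim) from one attention layer
    into a single L2-normalized global descriptor.
    Mean pooling is a hypothetical choice, not the paper's method."""
    desc = attn_tokens.mean(axis=0)
    return desc / (np.linalg.norm(desc) + 1e-12)

def verify_loop_candidate(tokens_a: np.ndarray,
                          tokens_b: np.ndarray,
                          threshold: float = 0.8):
    """Accept a retrieval match only if the cosine similarity of the two
    descriptors exceeds a threshold (illustrative value), rejecting
    likely false-positive loop closures. Returns (accepted, score)."""
    score = float(attention_descriptor(tokens_a) @ attention_descriptor(tokens_b))
    return score >= threshold, score
```

Because the descriptors are unit-normalized, the dot product is a cosine similarity in [-1, 1]; identical frames score 1.0 and clearly unrelated frames score near 0, so a single threshold separates the two regimes without any additional training.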

Top tags: computer vision, robotics, systems
Detailed tags: visual slam, dense reconstruction, factor graph, real-time, attention layers

VGGT-SLAM 2.0: Real-time Dense Feed-forward Scene Reconstruction


1️⃣ One-sentence summary

This paper presents an upgraded real-time visual SLAM system that significantly improves map accuracy and robustness through a redesigned alignment algorithm, and runs efficiently online in a variety of complex environments.

Source: arXiv: 2601.19887