arXiv submission date: 2026-04-29
📄 Abstract - AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision

Despite the rapid progress in data-driven 3D vision, aerial geometric 3D vision remains a formidable challenge due to the severe scarcity of large-scale, high-fidelity training data. Existing benchmarks, predominantly biased toward ground-level or object-centric views, do not account for complex viewpoint transformations and diverse environmental conditions in UAV-based sensing. To bridge this critical gap, we propose AirZoo, a unified large-scale dataset and benchmark for grounding aerial geometric 3D vision. AirZoo possesses three appealing properties: 1) Scalable Generation Pipeline: Leveraging freely available, world-scale photogrammetric 3D meshes, it renders vast outdoor environments with customizable UAV flight trajectories and configurable weather/illumination. 2) Comprehensive Scene Diversity: It provides the most extensive coverage of region types to date (spanning 378 regions across 22 countries), systematically encompassing both highly structured urban landscapes and complex unstructured natural environments. 3) Rich Geometric Annotations: Each frame provides synchronized, pixel-level metric depth and precise 6-DoF geo-referenced poses, essential for geometry-aware learning. Through three rigorous evaluation tracks -- aerial image retrieval, cross-view matching, and multi-view 3D reconstruction -- we demonstrate that AirZoo serves as a powerful pre-training engine. Extensive experiments on both public and newly collected real-world benchmarks reveal that fine-tuning on AirZoo yields substantial performance gains for SoTA models (e.g., MegaLoc, RoMa, VGGT, and Depth Anything 3), establishing a new performance upper bound for aerial spatial intelligence.

Top-level tags: computer vision data benchmark
Detailed tags: aerial 3d vision dataset uav geometric annotation pre-training

AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision


1️⃣ One-sentence summary

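AirZoo is a large-scale UAV aerial imagery dataset covering 378 regions across 22 countries. It is generated by automatically rendering real-world 3D models into images annotated with depth and pose. It significantly improves the performance of existing 3D vision models in aerial scenes and provides a key training resource for research on spatial intelligence for aerial vehicles.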

Source: arXiv: 2604.26567