arXiv submission date: 2026-04-27
📄 Abstract - WildLIFT: Lifting monocular drone video to 3D for species-agnostic wildlife monitoring

Monocular RGB cameras mounted on drones are widely used for wildlife monitoring, yet most analytical pipelines remain confined to two-dimensional image space, leaving geometric information in video underexploited. We present WildLIFT, a computational framework that integrates three-dimensional scene geometry from monocular drone video with open-vocabulary 2D instance segmentation to enable species-agnostic 3D detection and tracking. Oriented 3D bounding box labels with semantic face information enable quantitative assessment of viewpoint coverage and inter-animal occlusion, producing structured metadata for downstream ecological analyses. We validate the framework on 2,581 manually curated frames comprising over 6,700 3D detections across four large mammal species. WildLIFT maintains high identity consistency in multi-animal scenes and substantially reduces manual 3D annotation effort through keyframe-based refinement. By transforming standard drone footage into structured 3D and viewpoint-aware representations, WildLIFT extends the analytical utility of aerial wildlife datasets for behavioural research and population monitoring.
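The abstract does not say how 2D instance masks are combined with the recovered scene geometry. As one illustration of the general idea of lifting a 2D mask into 3D, here is a minimal sketch of pinhole back-projection. It assumes a per-pixel depth map and known camera intrinsics (`fx`, `fy`, `cx`, `cy`) are available. The function names `lift_mask_to_3d` and `centroid` are hypothetical and do not come from WildLIFT.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def lift_mask_to_3d(
    mask: Sequence[Sequence[int]],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project masked pixels into camera-frame 3D points.

    Assumes a pinhole camera: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels outside the mask or with non-positive depth are skipped.
    """
    points: List[Point3D] = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth)):
        for u, (inside, z) in enumerate(zip(mask_row, depth_row)):
            if not inside or z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def centroid(points: Sequence[Point3D]) -> Point3D:
    """Mean of the 3D points, usable as a rough object position estimate."""
    if not points:
        raise ValueError("cannot compute the centroid of an empty point set")
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


# Toy example: a 2x2 instance mask in a 4x4 image at a constant depth of 10 units.
mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
depth = [[10.0] * 4 for _ in range(4)]
points = lift_mask_to_3d(mask, depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
position = centroid(points)
```

In a real pipeline the points would also be transformed from the camera frame into a shared world frame using the estimated camera pose before any oriented 3D box is fitted. That step is omitted here.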

Top-level tags: computer vision, data, multi-modal
Detailed tags: 3D detection, drone video, wildlife monitoring, open-vocabulary, instance segmentation

WildLIFT: Lifting monocular drone video to 3D for species-agnostic wildlife monitoring


1️⃣ One-sentence summary

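WildLIFT proposes a method that automatically reconstructs 3D scene geometry from ordinary single-camera drone video and combines it with open-vocabulary 2D instance segmentation to detect and track the positions and movements of multiple wildlife species in 3D without relying on species-specific information, substantially reducing manual annotation effort and providing richer 3D data for ecological research and population monitoring.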

Source: arXiv: 2604.24718