Architecture and evaluation protocol for transformer-based visual object tracking in UAV applications
1️⃣ One-sentence summary
This paper proposes a new approach to visual object tracking for UAVs: it combines a transformer model with motion-compensation techniques to improve tracking robustness, and it also designs an evaluation protocol that better reflects real performance on embedded devices to validate its effectiveness.
Object tracking from Unmanned Aerial Vehicles (UAVs) is challenged by platform dynamics, camera motion, and limited onboard resources. Existing visual trackers either lack robustness in complex scenarios or are too computationally demanding for real-time embedded use. We propose a Modular Asynchronous Tracking Architecture (MATA) that combines a transformer-based tracker with an Extended Kalman Filter, integrating ego-motion compensation from sparse optical flow and an object trajectory model. We further introduce a hardware-independent, embedded-oriented evaluation protocol and a new metric, Normalized Time to Failure (NT2F), to quantify how long a tracker can sustain a tracking sequence without external help. Experiments on UAV benchmarks, including an augmented UAV123 dataset with synthetic occlusions, show consistent improvements in Success and NT2F metrics across multiple tracking processing frequencies. A ROS 2 implementation on an NVIDIA Jetson AGX Orin confirms that the evaluation protocol closely matches real-time performance on embedded systems.
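The abstract describes NT2F as a measure of how long a tracker sustains a sequence without external help. A plausible reading is the fraction of a sequence completed before the first tracking failure (e.g. overlap with ground truth dropping to the failure threshold), normalized by sequence length. The sketch below is a hypothetical simplification under that assumption; the paper's exact definition may differ.

```python
def nt2f(overlaps, fail_threshold=0.0):
    """Normalized Time to Failure (hypothetical sketch).

    overlaps: per-frame IoU between the tracker's prediction and
    ground truth. Returns the fraction of the sequence tracked
    before the first failure (IoU <= fail_threshold), so 1.0 means
    the whole sequence was tracked without failure.
    """
    if not overlaps:
        raise ValueError("empty sequence")
    for i, iou in enumerate(overlaps):
        if iou <= fail_threshold:
            # Failure at frame i: i frames survived out of len(overlaps).
            return i / len(overlaps)
    return 1.0


# Example: failure on the 3rd of 4 frames -> 2/4 = 0.5 survived.
print(nt2f([0.8, 0.7, 0.0, 0.5]))  # 0.5
print(nt2f([0.9, 0.8, 0.7, 0.6]))  # 1.0
```

A normalized metric like this lets sequences of different lengths be averaged directly, which matters when comparing trackers across heterogeneous UAV benchmarks.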
Source: arXiv: 2603.03904