arXiv submission date: 2026-04-02
📄 Abstract - Interactive Tracking: A Human-in-the-Loop Paradigm with Memory-Augmented Adaptation

Existing visual trackers mainly operate in a non-interactive, fire-and-forget manner, making them impractical for real-world scenarios that require human-in-the-loop adaptation. To overcome this limitation, we introduce Interactive Tracking, a new paradigm that allows users to guide the tracker at any time using natural language commands. To support research in this direction, we make three main contributions. First, we present InteractTrack, the first large-scale benchmark for interactive tracking, containing 150 videos with dense bounding box annotations and timestamped language instructions. Second, we propose a comprehensive evaluation protocol and evaluate 25 representative trackers, showing that state-of-the-art methods fail in interactive scenarios; strong performance on conventional benchmarks does not transfer. Third, we introduce Interactive Memory-Augmented Tracking (IMAT), a new baseline that employs a dynamic memory mechanism to learn from user feedback and update tracking behavior accordingly. Our benchmark, protocol, and baseline establish a foundation for developing more intelligent, adaptive, and collaborative tracking systems, bridging the gap between automated perception and human guidance. The full benchmark, tracking results, and analysis are available at this https URL.
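The abstract describes IMAT only at a high level, so as a purely illustrative sketch (all names and interfaces here are hypothetical, not the paper's actual API), an interactive tracking loop that folds timestamped language instructions into a bounded memory might look like:

```python
from dataclasses import dataclass
from collections import deque


@dataclass
class Instruction:
    """A user command tied to a specific frame, as in InteractTrack's annotations."""
    frame_idx: int
    text: str


class InteractiveTracker:
    """Hypothetical sketch of a memory-augmented interactive tracker.

    Keeps a bounded memory of recent user feedback; a real tracker would
    condition its prediction on that memory rather than just echoing it.
    """

    def __init__(self, memory_size: int = 5):
        self.memory = deque(maxlen=memory_size)

    def update(self, frame_idx, detection, instructions):
        # Fold any instruction timestamped at this frame into memory.
        for ins in instructions:
            if ins.frame_idx == frame_idx:
                self.memory.append(ins.text)
        # Placeholder output: the raw detection plus the latest guidance.
        guidance = self.memory[-1] if self.memory else None
        return {"box": detection, "guidance": guidance}


tracker = InteractiveTracker()
instructions = [Instruction(2, "track the red car instead")]
# Simulate four frames; the instruction takes effect at frame 2 and persists.
outputs = [tracker.update(i, (10 * i, 20, 50, 80), instructions) for i in range(4)]
```

The point of the sketch is the control flow: instructions are timestamped rather than given once up front, and guidance persists in memory across subsequent frames instead of being consumed and forgotten.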

Top-level tags: computer vision, benchmark, agents
Detailed tags: interactive tracking, visual tracking, human-in-the-loop, natural language commands, memory-augmented adaptation

Interactive Tracking: A Human-in-the-Loop Paradigm with Memory-Augmented Adaptation


1️⃣ One-sentence summary

This paper proposes a new interactive visual tracking paradigm in which users can steer the tracker in real time via natural language instructions. To support it, the authors build a large-scale benchmark, an evaluation protocol, and a new baseline that learns from user feedback through a memory mechanism, aiming to make visual tracking systems more intelligent, more adaptive, and better suited to real-world human-machine collaboration.

Source: arXiv 2604.01974