arXiv submission date: 2026-03-31
📄 Abstract - Benchmarking Interaction, Beyond Policy: a Reproducible Benchmark for Collaborative Instance Object Navigation

We propose Question-Asking Navigation (QAsk-Nav), the first reproducible benchmark for Collaborative Instance Object Navigation (CoIN) that enables an explicit, separate assessment of embodied navigation and collaborative question asking. CoIN tasks an embodied agent with reaching a target specified in free-form natural language under partial observability, using only egocentric visual observations and interactive natural-language dialogue with a human, where the dialogue can help to resolve ambiguity among visually similar object instances. Existing CoIN benchmarks focus primarily on navigation success and offer no support for consistent evaluation of collaborative interaction. To address this limitation, QAsk-Nav provides (i) a lightweight question-asking protocol scored independently of navigation, (ii) an enhanced navigation protocol with realistic, diverse, high-quality target descriptions, and (iii) an open-source dataset that includes 28,000 quality-checked reasoning and question-asking traces for training and analysis of the interactive capabilities of CoIN models. Using the proposed QAsk-Nav benchmark, we develop Light-CoNav, a lightweight unified model for collaborative navigation that is 3x smaller and 70x faster than existing modular methods, while outperforming state-of-the-art CoIN approaches in generalization to unseen objects and environments. Project page at this https URL

Top-level tags: agents, benchmark, natural language processing
Detailed tags: embodied ai, collaborative navigation, human-agent interaction, evaluation framework, visual-language navigation

Benchmarking Interaction, Beyond Policy: a Reproducible Benchmark for Collaborative Instance Object Navigation


1️⃣ One-sentence summary

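This paper proposes QAsk-Nav, the first reproducible benchmark for Collaborative Instance Object Navigation, which evaluates navigation and collaborative question asking independently, and uses it to develop a lightweight unified navigation model that is smaller, faster, and generalizes better.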

Source: arXiv: 2604.00265