arXiv submission date: 2026-03-09
📄 Abstract - The Neural Compass: Probabilistic Relative Feature Fields for Robotic Search

Object co-occurrences provide a key cue for finding objects successfully and efficiently in unfamiliar environments. Typically, one looks for cups in kitchens and views fridges as evidence of being in a kitchen. Such priors have also been exploited in artificial agents, but they are typically learned from explicitly labeled data or queried from language models. It is still unclear whether these relations can be learned implicitly from unlabeled observations alone. In this work, we address this problem and propose ProReFF, a feature field model trained to predict relative distributions of features obtained from pre-trained vision language models. In addition, we introduce a learning-based strategy that enables training from unlabeled and potentially contradictory data by aligning inconsistent observations into a coherent relative distribution. For the downstream object search task, we propose an agent that leverages predicted feature distributions as a semantic prior to guide exploration toward regions with a high likelihood of containing the object. We present extensive evaluations demonstrating that ProReFF captures meaningful relative feature distributions in natural scenes and provides insight into the impact of our proposed alignment step. We further evaluate the performance of our search agent in 100 challenges in the Matterport3D simulator, comparing with feature-based baselines and human participants. The proposed agent is 20% more efficient than the strongest baseline and achieves up to 80% of human performance.

Top-level tags: robotics, agents, computer vision
Detailed tags: semantic search, feature fields, probabilistic modeling, object co-occurrence, vision-language models

The Neural Compass: Probabilistic Relative Feature Fields for Robotic Search


1️⃣ One-sentence summary

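This paper proposes ProReFF, a new model that learns spatial association patterns between objects (for example, "cups are likely to be found near a fridge") automatically from unlabeled visual observations alone, and uses this learned "common sense" to significantly improve the efficiency and success rate of robots searching for target objects in unfamiliar environments.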

Source: arXiv: 2603.08544