Identifying the Unknown: Prompt-Free Open Vocabulary Anomaly Recognition for Robot-Object Interaction
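1️⃣ One-sentence summary
This paper proposes AnomNOVIC, a two-stage framework that lets a robot detect and classify previously unseen objects in real time without being given the object class names in advance (that is, prompt-free), and that significantly outperforms existing open vocabulary detectors in robot grasping and manipulation scenarios.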
Robots operating in real-world environments must generally be able to recognize previously unseen objects. As robotic systems move toward open-world autonomy, there is a growing, yet largely unmet, need for open vocabulary object detectors that are prompt-free and efficient enough for continuous deployment. We present AnomNOVIC, a two-stage known-workspace framework that combines a masked autoencoder (MAE) trained for anomaly detection with NOVIC, a powerful real-time prompt-free open vocabulary image classifier. The MAE produces generic object-agnostic bounding boxes, allowing NOVIC to classify salient image regions without requiring a predefined candidate class list. We evaluate AnomNOVIC against strong open vocabulary baselines in a tabletop robot-object environment featuring the NICOL humanoid robot, reaching 47.1% AP / 57.5% AP50 for prompt-free recognition, and 59.0% AP / 72.5% AP50 if class candidates are provided. Across additional datasets, including an in-the-wild test set with 48 unique objects, AnomNOVIC reaches up to 82.6% prompt-free detection and classification accuracy. These results significantly surpass all tested open vocabulary baselines, including YOLO-World-v2, OWLv2, and YOLOE.
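To make the two-stage idea concrete, here is a minimal sketch of how an anomaly stage can feed class-agnostic boxes to a classifier. It is not the authors' implementation. The abstract does not say how the MAE output becomes bounding boxes, so this sketch assumes a per-pixel reconstruction error that is thresholded and grouped into 4-connected regions. The names `anomaly_map`, `boxes_from_map`, `recognize`, and the `classify` callable (a stand-in for a prompt-free classifier such as NOVIC) are all hypothetical, as are the threshold and minimum-area values.

```python
from collections import deque
from typing import Callable, List, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max is exclusive


def anomaly_map(image: np.ndarray, reconstruction: np.ndarray) -> np.ndarray:
    """Per-pixel absolute reconstruction error, averaged over channels."""
    diff = np.abs(image.astype(np.float32) - reconstruction.astype(np.float32))
    return diff.mean(axis=-1)


def boxes_from_map(error: np.ndarray, threshold: float, min_area: int = 4) -> List[Box]:
    """Threshold the error map and return one box per 4-connected region."""
    mask = error > threshold
    seen = np.zeros(mask.shape, dtype=bool)
    height, width = mask.shape
    boxes: List[Box] = []
    for y0, x0 in zip(*np.nonzero(mask)):
        if seen[y0, x0]:
            continue
        # Breadth-first flood fill over the region containing (y0, x0).
        queue = deque([(int(y0), int(x0))])
        seen[y0, x0] = True
        ys, xs = [], []
        while queue:
            y, x = queue.popleft()
            ys.append(y)
            xs.append(x)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        if len(ys) >= min_area:  # drop tiny regions that are likely noise
            boxes.append((min(xs), min(ys), max(xs) + 1, max(ys) + 1))
    return boxes


def recognize(
    image: np.ndarray,
    reconstruction: np.ndarray,
    classify: Callable[[np.ndarray], str],
    threshold: float = 0.2,  # assumed value, would need tuning in practice
) -> List[Tuple[Box, str]]:
    """Stage 1: class-agnostic boxes from the error map. Stage 2: label each crop."""
    results = []
    for box in boxes_from_map(anomaly_map(image, reconstruction), threshold):
        x0, y0, x1, y1 = box
        results.append((box, classify(image[y0:y1, x0:x1])))
    return results


# Toy usage: an "empty workspace" reconstruction and a frame with one new patch.
background = np.zeros((16, 16, 3), dtype=np.float32)
frame = background.copy()
frame[2:6, 3:8] = 1.0  # the unseen object occupies rows 2-5, columns 3-7
detections = recognize(frame, background, classify=lambda crop: "unknown object")
print(detections)
```

The design point the sketch tries to capture is that the first stage never needs class names: it only flags regions that deviate from the known workspace, so any free-form labelling happens per crop in the second stage.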
Source: arXiv: 2606.26829