arXiv submission date: 2026-02-09
📄 Abstract - From Obstacles to Etiquette: Robot Social Navigation with VLM-Informed Path Selection

Navigating socially in human environments requires more than satisfying geometric constraints, as collision-free paths may still interfere with ongoing activities or conflict with social norms. Addressing this challenge calls for analyzing interactions between agents and incorporating common-sense reasoning into planning. This paper presents a social robot navigation framework that integrates geometric planning with contextual social reasoning. The system first extracts obstacles and human dynamics to generate geometrically feasible candidate paths, then leverages a fine-tuned vision-language model (VLM) to evaluate these paths, informed by contextually grounded social expectations, selecting a socially optimized path for the controller. This task-specific VLM distills social reasoning from large foundation models into a smaller and efficient model, allowing the framework to perform real-time adaptation in diverse human-robot interaction contexts. Experiments in four social navigation contexts demonstrate that our method achieves the best overall performance with the lowest personal space violation duration, the minimal pedestrian-facing time, and no social zone intrusions. Project page: this https URL
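The two-stage idea in the abstract (generate geometrically feasible candidates, then pick the one that scores best socially) can be sketched roughly as below. This is only an illustration under stated assumptions: the paper scores paths with a fine-tuned VLM, whereas `heuristic_social_score` here is a hypothetical stand-in that merely counts waypoints inside a personal-space radius. The names `Scene`, `select_path`, and the 1.2 m radius are all invented for the example and do not come from the paper.

```python
from dataclasses import dataclass
from math import hypot
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Path = List[Point]


@dataclass
class Scene:
    """Hypothetical scene description: 2D positions of nearby people."""
    humans: List[Point]
    personal_space_m: float = 1.2  # assumed radius, not a value from the paper


def personal_space_violations(path: Path, scene: Scene) -> int:
    """Count waypoints that fall inside any person's personal-space radius."""
    return sum(
        1
        for (px, py) in path
        if any(
            hypot(px - hx, py - hy) < scene.personal_space_m
            for (hx, hy) in scene.humans
        )
    )


def heuristic_social_score(path: Path, scene: Scene) -> float:
    """Placeholder for the VLM evaluator: higher means more socially acceptable."""
    return -float(personal_space_violations(path, scene))


def select_path(
    candidates: List[Path],
    scene: Scene,
    scorer: Callable[[Path, Scene], float] = heuristic_social_score,
) -> Path:
    """Return the candidate with the highest social score."""
    if not candidates:
        raise ValueError("no candidate paths to choose from")
    return max(candidates, key=lambda p: scorer(p, scene))


# Two geometrically feasible candidates around one person standing at (2, 0).
scene = Scene(humans=[(2.0, 0.0)])
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (3.0, 0.0)]
detour = [(0.0, 0.0), (1.0, 1.5), (2.0, 2.0), (3.0, 0.0)]
chosen = select_path([straight, detour], scene)
```

Passing the scorer as a parameter keeps the selection step independent of how paths are evaluated, so a learned evaluator could replace the heuristic without changing `select_path`.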

Top-level tags: robotics multi-modal agents
Detailed tags: social navigation, vision-language model, path planning, human-robot interaction, contextual reasoning

From Obstacles to Etiquette: Robot Social Navigation with VLM-Informed Path Selection


1️⃣ One-sentence summary

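This paper proposes a new method that makes robots more "polite" when moving among people: it first uses conventional planning techniques to generate several physically feasible routes, then uses a specially fine-tuned vision-language model to "reason", as a person would, about which route best fits social etiquette, and so selects a path that is both safe and unobtrusive.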

Source: arXiv: 2602.09002