Cross-Modal Navigation with Multi-Agent Reinforcement Learning
1️⃣ One-Sentence Summary
This paper proposes CRONA, a multi-agent reinforcement learning framework in which multiple lightweight agents, each specializing in a different sensory modality (e.g., vision or hearing), collaborate to navigate. This sidesteps two problems in robot navigation: well-aligned multi-modal data is hard to obtain, and a single monolithic multi-modal model becomes overly complex. Experiments show the approach outperforms single-agent models in both performance and efficiency, and reveal which forms of agent collaboration work best in different scenarios.
Robust embodied navigation relies on complementary sensory cues. However, high-quality and well-aligned multi-modal data is often difficult to obtain in practice. Training a monolithic model is also challenging, as rich multi-modal inputs induce complex representations and substantially enlarge the policy space. Cross-modal collaboration among lightweight modality-specialized agents offers a scalable paradigm: it enables flexible deployment and parallel execution while preserving the strength of each modality. In this paper, we propose CRONA, a Multi-Agent Reinforcement Learning (MARL) framework for Cross-Modal Navigation. CRONA improves collaboration by leveraging control-relevant auxiliary beliefs and a centralized multi-modal critic with global state. Experiments on visual-acoustic navigation tasks show that multi-agent methods significantly improve performance and efficiency over single-agent baselines. We find that homogeneous collaboration with limited modalities is sufficient for short-range navigation under salient cues; heterogeneous collaboration among agents with complementary modalities is generally efficient and effective; and navigation in large, complex environments requires both richer multi-modal perception and increased model capacity.
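The abstract describes a centralized-training, decentralized-execution pattern: each lightweight agent acts from its own modality's observation, while a centralized critic scores the global state formed from all modalities. The paper's actual networks are not reproduced here; the sketch below is a minimal, hypothetical illustration of that structure, with all dimensions, layer shapes, and the greedy action rule chosen for brevity rather than taken from CRONA.

```python
import numpy as np

rng = np.random.default_rng(0)


class ModalityAgent:
    """Lightweight per-modality actor: maps its own observation to action logits.

    A single linear layer stands in for the paper's (unspecified) policy network.
    """

    def __init__(self, obs_dim, n_actions):
        self.W = rng.standard_normal((obs_dim, n_actions)) * 0.1

    def act(self, obs):
        logits = obs @ self.W
        return int(np.argmax(logits))  # greedy action, for illustration only


class CentralCritic:
    """Centralized critic: values the global state (all modalities concatenated).

    Used only during training in a CTDE setup; execution needs just the actors.
    """

    def __init__(self, state_dim):
        self.w = rng.standard_normal((state_dim, 1)) * 0.1

    def value(self, global_state):
        return float(global_state @ self.w)


# Hypothetical dimensions: 8-d visual features, 4-d acoustic features, 3 actions.
vision_agent = ModalityAgent(obs_dim=8, n_actions=3)
audio_agent = ModalityAgent(obs_dim=4, n_actions=3)
critic = CentralCritic(state_dim=8 + 4)

v_obs = rng.standard_normal(8)   # stand-in visual encoding
a_obs = rng.standard_normal(4)   # stand-in acoustic encoding

# Decentralized execution: each agent acts from its own modality alone.
actions = (vision_agent.act(v_obs), audio_agent.act(a_obs))

# Centralized evaluation: the critic sees the concatenated global state.
value = critic.value(np.concatenate([v_obs, a_obs]))
print(actions, value)
```

The key property this sketch shows is the asymmetry of information: actors never see each other's observations, so they stay lightweight and can run in parallel, while the critic's global view guides joint credit assignment during training.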
Source: arXiv: 2605.06595