arXiv submission date: 2026-06-11
📄 Abstract - ArogyaSutra: A Multi-Agent Framework for Multimodal Medical Reasoning in Indic Languages

Multimodal Large Language Models (MLLMs) have shown promising reasoning capabilities in general domains, yet their performance remains limited in specialized settings such as healthcare, especially in multilingual and low-resource scenarios. This gap is critical in regions like rural India, where patients often express complex medical queries in native Indic languages and rely on multimodal inputs such as medical images. Existing English-centric MLLMs struggle to support such use cases, limiting equitable access to AI-driven healthcare assistance. To address this challenge, we introduce ArogyaBodha, a large-scale multilingual multimodal medical question-answer dataset constructed from eight heterogeneous sources, covering 31 body systems, six imaging modalities, and 21 clinical domains across English and seven major Indian languages. We further propose ArogyaSutra, an actor-critic-based multi-agent framework that integrates tool grounding with dual-memory mechanisms for step-wise, reasoning-aware decision making, and uses stored actor-critic simulation trajectories for distillation. Experiments show that our dataset and framework improve multilingual medical reasoning accuracy across all Indic languages, with ablations validating the contribution of each component. The source code and dataset are available at: this https URL ArogyaSutra/

Top-level tags: medical multi-modal multi-agents
Detailed tags: multilingual medical reasoning low-resource languages dataset question answering

ArogyaSutra: A Multi-Agent Framework for Multimodal Medical Reasoning in Indic Languages


1️⃣ One-sentence summary

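This paper proposes ArogyaSutra, a multi-agent framework that, combined with a large-scale multilingual medical dataset, addresses the inability of current AI systems to handle medical questions that patients in low-resource settings such as rural India describe in their local languages (for example, alongside medical images), thereby improving the accuracy of multilingual medical reasoning.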

Source: arXiv: 2606.13572