arXiv submission date: 2026-03-25
📄 Abstract - MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare

Conversational artificial intelligence has the potential to assist users in preliminary medical consultations, particularly in settings where access to healthcare professionals is limited. However, many existing medical dialogue systems operate in a single-turn question-answering paradigm or rely on template-based datasets, limiting conversational realism and multilingual applicability. In this work, we introduce MedAidDialog, a multilingual multi-turn medical dialogue dataset designed to simulate realistic physician-patient consultations. The dataset extends the MDDial corpus by generating synthetic consultations using large language models and further expands them into a parallel multilingual corpus covering seven languages: English, Hindi, Telugu, Tamil, Bengali, Marathi, and Arabic. Building on this dataset, we develop MedAidLM, a conversational medical model trained using parameter-efficient fine-tuning on quantized small language models, enabling deployment without high-end computational infrastructure. Our framework additionally incorporates optional patient pre-context information (e.g., age, gender, allergies) to personalize the consultation process. Experimental results demonstrate that the proposed system can effectively perform symptom elicitation through multi-turn dialogue and generate diagnostic recommendations. We further conduct medical expert evaluation to assess the plausibility and coherence of the generated consultations.

Top-level tags: medical llm natural language processing
Detailed tags: medical dialogue dataset multilingual synthetic data generation parameter-efficient fine-tuning healthcare accessibility

MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare


1️⃣ One-Sentence Summary

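This paper builds a multi-turn medical dialogue dataset of realistic, synthetically generated consultations in seven languages, and uses it to develop a lightweight AI medical assistant that runs without high-end hardware, with the aim of helping users in areas with limited medical resources obtain preliminary consultations through simulated physician-patient dialogue.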

Source: arXiv: 2603.24132