arXiv submission date: 2026-05-13
📄 Abstract - MILM: Large Language Models for Multimodal Irregular Time Series with Informative Sampling

Multimodal irregular time series (MITS) consist of asynchronous and irregularly sampled observations from heterogeneous numerical and textual channels. In healthcare, for example, patients' electronic health records (EHR) include irregular lab measurements and clinical notes. The irregular timing and channel patterns of observations carry predictive signal alongside the numerical values and textual content. LLMs are natural candidates for processing such heterogeneous data, given their extensive pretrained knowledge spanning textual and numerical domains. We introduce MILM (Multimodal Irregular time series Language Model), which represents MITS as time-ordered triplets in Extensible Markup Language (XML) format and fine-tunes an LLM through a two-stage strategy for MITS classification. The first stage trains on value-redacted MITS to predict from sampling patterns alone, and the second stage trains on full MITS to jointly model sampling patterns and observed values. Our two-stage model (MILM-2S) and its single-stage counterpart (MILM-Direct) achieve the best and second-best average performance on multiple EHR datasets. Further value-redaction evaluations confirm that sampling patterns carry predictive signal and that MILM-2S learns to exploit them. In the value-pending evaluation we introduce, where some values are unavailable at prediction time, MILM-2S outperforms MILM-Direct by a larger margin than in standard evaluation. For MILM-2S, preserving the time and channel of value-pending observations as additional sampling information further improves in-hospital mortality prediction.
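To make the representation described in the abstract more concrete, here is a minimal Python sketch of serializing (time, channel, value) triplets as time-ordered XML, with an optional redaction step that hides values but keeps each observation's time and channel. This is an illustration under stated assumptions, not the authors' implementation: the tag and attribute names (`series`, `obs`, `t`, `channel`), the `[REDACTED]` placeholder, and the toy record are all hypothetical, since the abstract does not specify the exact format.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

PLACEHOLDER = "[REDACTED]"  # hypothetical marker; the actual token is not given in the abstract


def to_xml(observations, redact=lambda obs: False):
    """Serialize (time, channel, value) triplets as time-ordered XML.

    `redact` decides per observation whether the value is replaced by a
    placeholder while its time and channel are kept.
    """
    lines = ["<series>"]
    for obs in sorted(observations, key=lambda o: o[0]):  # order by timestamp
        time, channel, value = obs
        text = PLACEHOLDER if redact(obs) else escape(str(value))
        lines.append(
            f"  <obs t={quoteattr(str(time))} channel={quoteattr(channel)}>{text}</obs>"
        )
    lines.append("</series>")
    return "\n".join(lines)


# Toy EHR-like record (invented): asynchronous numeric labs and a free-text note.
observations = [
    (5.0, "note", "Patient reports chest pain & dyspnea."),
    (0.5, "heart_rate", 88),
    (2.0, "lactate", 2.1),
]

full = to_xml(observations)                               # times, channels, and values
redacted = to_xml(observations, redact=lambda obs: True)  # sampling pattern only

# Both serializations should parse as well-formed XML.
full_root = ET.fromstring(full)
redacted_root = ET.fromstring(redacted)
print(full)
print(redacted)
```

Passing a predicate rather than a boolean is a design choice of this sketch: redacting every value mimics the value-redacted setting, while redacting only selected observations loosely mimics values that are still pending at prediction time but whose time and channel are known.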

Top-level tags: llm medical multi-modal
Detailed tags: irregular time series electronic health records fine-tuning sampling patterns classification

MILM: Large Language Models for Multimodal Irregular Time Series with Informative Sampling


1️⃣ One-sentence summary

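This paper proposes MILM, which converts multimodal irregular time series from domains such as healthcare (numerical values and text) into time-ordered triplets in XML format and fine-tunes a large language model with a two-stage strategy, so that the model not only fuses numerical and textual information but also exploits the sampling pattern itself (such as when and how often measurements are taken) to improve prediction, with the advantage most pronounced when some values are missing.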

Source: arXiv: 2605.13711