arXiv submission date: 2026-02-24
📄 Abstract - CG-DMER: Hybrid Contrastive-Generative Framework for Disentangled Multimodal ECG Representation Learning

Accurate interpretation of electrocardiogram (ECG) signals is crucial for diagnosing cardiovascular diseases. Recent multimodal approaches that integrate ECGs with accompanying clinical reports show strong potential, but they still face two main concerns from a modality perspective: (1) intra-modality: existing models process ECGs in a lead-agnostic manner, overlooking spatial-temporal dependencies across leads, which restricts their effectiveness in modeling fine-grained diagnostic patterns; (2) inter-modality: existing methods directly align ECG signals with clinical reports, introducing modality-specific biases due to the free-text nature of the reports. In light of these two issues, we propose CG-DMER, a contrastive-generative framework for disentangled multimodal ECG representation learning, powered by two key designs: (1) Spatial-temporal masked modeling is designed to better capture fine-grained temporal dynamics and inter-lead spatial dependencies by applying masking across both spatial and temporal dimensions and reconstructing the missing information. (2) A representation disentanglement and alignment strategy is designed to mitigate unnecessary noise and modality-specific biases by introducing modality-specific and modality-shared encoders, ensuring a clearer separation between modality-invariant and modality-specific representations. Experiments on three public datasets demonstrate that CG-DMER achieves state-of-the-art performance across diverse downstream tasks.
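The spatial-temporal masking idea in the abstract can be illustrated with a minimal sketch. The abstract does not give masking ratios, patch sizes, or the reconstruction objective, so everything below is an assumption made for illustration: the function names `spatiotemporal_mask` and `reconstruction_loss`, the parameters `patch_len`, `lead_ratio`, and `time_ratio`, zero-filling of hidden values, and a mean-squared-error loss over masked positions. It is not the authors' implementation.

```python
import numpy as np


def spatiotemporal_mask(ecg, patch_len=50, lead_ratio=0.25, time_ratio=0.5, rng=None):
    """Hide whole leads (spatial) and whole time patches (temporal).

    ecg: array of shape (num_leads, num_samples).
    Returns (masked_ecg, mask), where mask is True at hidden positions.
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_leads, num_samples = ecg.shape
    num_patches = num_samples // patch_len
    mask = np.zeros(ecg.shape, dtype=bool)

    # Spatial masking: hide a random subset of leads entirely.
    n_masked_leads = int(round(lead_ratio * num_leads))
    masked_leads = rng.choice(num_leads, size=n_masked_leads, replace=False)
    mask[masked_leads, :] = True

    # Temporal masking: hide the same random time patches in every lead.
    n_masked_patches = int(round(time_ratio * num_patches))
    masked_patches = rng.choice(num_patches, size=n_masked_patches, replace=False)
    for p in masked_patches:
        mask[:, p * patch_len:(p + 1) * patch_len] = True

    # Zero-fill hidden positions (an assumption; a learned mask token is also common).
    masked_ecg = np.where(mask, 0.0, ecg)
    return masked_ecg, mask


def reconstruction_loss(pred, target, mask):
    """Mean squared error computed only over the hidden positions."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))


if __name__ == "__main__":
    # Synthetic 12-lead signal with 1000 samples per lead (random data, not real ECG).
    ecg = np.random.default_rng(0).standard_normal((12, 1000))
    masked_ecg, mask = spatiotemporal_mask(ecg, rng=np.random.default_rng(1))
    print("fraction of samples hidden:", mask.mean())
```

In a masked-modeling setup of this kind, an encoder-decoder would receive `masked_ecg` and be trained to minimize `reconstruction_loss` against the original signal. Masking along both axes forces the model to use other leads to recover a hidden lead and neighboring time segments to recover a hidden patch.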

Top-level tags: medical, multi-modal, model training
Detailed tags: ECG analysis, representation learning, contrastive learning, generative modeling, multimodal fusion

CG-DMER: Hybrid Contrastive-Generative Framework for Disentangled Multimodal ECG Representation Learning


1️⃣ One-sentence summary

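This paper proposes CG-DMER, a framework that combines contrastive and generative learning with spatial-temporal masked modeling and a representation disentanglement and alignment strategy, addressing two weaknesses of existing methods for fusing ECGs with clinical reports (neglecting spatial-temporal dependencies across leads and introducing text-specific biases) and achieving leading performance on a variety of downstream tasks.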

Source: arXiv: 2602.21154