Mining Electronic Health Records to Investigate Effectiveness of Ensemble Deep Clustering
1️⃣ One-Sentence Summary
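This study proposes a new ensemble deep clustering method that integrates clustering results obtained across multiple embedding dimensions, markedly improving the identification of heart failure patient subtypes from electronic health records and demonstrating the advantage of combining traditional and deep learning methods.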
In electronic health records (EHRs), clustering patients and distinguishing disease subtypes are key tasks for elucidating pathophysiology and aiding clinical decision-making. However, clustering in healthcare informatics still relies mainly on traditional methods, especially K-means, and hybrid methods that apply such clustering to embedding representations learned by autoencoders have achieved limited success. This paper investigates the effectiveness of traditional, hybrid, and deep learning methods in heart failure patient cohorts using real EHR data from the All of Us Research Program. Traditional clustering methods perform robustly by comparison, because deep clustering approaches were designed specifically for image clustering, a task that differs substantially from the tabular EHR data setting. To address the shortcomings of deep clustering, we introduce an ensemble-based deep clustering approach that aggregates cluster assignments obtained from multiple embedding dimensions, rather than relying on a single fixed embedding space. When combined with traditional clustering in a novel ensemble framework, the proposed ensemble embedding for deep clustering delivers the best overall performance ranking across 14 diverse clustering methods and multiple patient cohorts. This paper underscores the importance of biological sex-specific clustering of EHR data and the advantages of combining traditional and deep clustering approaches over relying on a single method.
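The idea of aggregating cluster assignments from several embedding dimensions can be illustrated with a small consensus-clustering sketch. The abstract does not say how the paper aggregates assignments, so the co-association (pairwise voting) scheme below is an assumption chosen for illustration, not the authors' method. The function names `co_association` and `consensus_clusters`, the `threshold` parameter, and the hard-coded label vectors (standing in for clusterings of autoencoder embeddings of sizes 8, 16, and 32) are all hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of labelings in which each pair of patients shares a cluster."""
    n = len(labelings[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Link patients co-clustered in more than `threshold` of the runs,
    then return the connected components as consensus labels."""
    coassoc = co_association(labelings)
    n = len(coassoc)
    parent = list(range(n))  # union-find forest over patients

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical cluster assignments for six patients, one vector per
# embedding dimension. Label values are arbitrary and need not match
# across runs; only co-membership matters.
labelings_by_dim = {
    8: [0, 0, 0, 1, 1, 1],
    16: [1, 1, 0, 0, 0, 0],  # patient 2 lands in the other group here
    32: [2, 2, 2, 5, 5, 5],
}

consensus = consensus_clusters(list(labelings_by_dim.values()))
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

Voting on pairwise co-membership sidesteps the label-matching problem, since cluster IDs from different embedding sizes are not comparable. The single disagreement in the 16-dimensional run is outvoted by the other two runs, which is the robustness an ensemble is meant to provide over a single fixed embedding space.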
Source: arXiv: 2604.07085