arXiv submission date: 2026-04-16
📄 Abstract - Co-distilled attention guided masked image modeling with noisy teacher for self-supervised learning on medical images

Masked image modeling (MIM) is a highly effective self-supervised learning (SSL) approach for extracting useful feature representations from unannotated data. The random masking methods used predominantly make SSL less effective on medical images: because neighboring patches are contextually similar, information leaks across the mask boundary and the SSL task becomes too easy. The hierarchical shifted window (Swin) transformer, a highly effective architecture for medical images, cannot use advanced masking methods because it lacks a global [CLS] token. Hence, we introduce an attention guided masking mechanism for Swin within a co-distillation learning framework that selectively masks semantically co-occurring and discriminative patches, reducing information leakage and increasing the difficulty of SSL pretraining. However, attention guided masking inevitably reduces the diversity of attention heads, which degrades downstream task performance. To address this, we, for the first time, integrate a noisy teacher into the co-distillation framework (termed DAGMaN) that performs attentive masking while preserving high attention head diversity. We demonstrate the capability of DAGMaN on multiple tasks including full- and few-shot lung nodule classification, immunotherapy outcome prediction, tumor segmentation, and unsupervised organ clustering.
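The core idea of attention guided masking is to rank patches by an attention-derived importance score and mask the most discriminative ones, rather than masking uniformly at random. The sketch below is a minimal illustration of that ranking step, assuming per-patch attention scores are already available (the paper derives them inside a Swin/co-distillation setup, which is not reproduced here; `attention_guided_mask` and its signature are hypothetical names for illustration only).

```python
import numpy as np

def attention_guided_mask(attn_scores, mask_ratio=0.6):
    """Choose which patches to mask, guided by attention.

    attn_scores: (num_patches,) array of per-patch importance,
    e.g. attention averaged over heads. The highest-scoring
    (most discriminative) patches are masked first, which makes
    the reconstruction task harder than uniform random masking.
    Returns a boolean array, True = patch is masked.
    """
    num_patches = attn_scores.shape[0]
    num_masked = int(round(mask_ratio * num_patches))
    # Rank patches by attention score, highest first.
    order = np.argsort(-attn_scores)
    mask = np.zeros(num_patches, dtype=bool)
    mask[order[:num_masked]] = True
    return mask

# Toy example: 8 patches, where patches 2 and 5 carry the most
# attention; with mask_ratio=0.25 exactly those two are masked.
scores = np.array([0.05, 0.10, 0.30, 0.05, 0.10, 0.25, 0.10, 0.05])
mask = attention_guided_mask(scores, mask_ratio=0.25)
```

In practice a method like this typically mixes in some randomness (and, per the abstract, a noisy teacher) so that masking does not collapse attention-head diversity; a purely greedy top-k selection as above is only the starting point.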

Top-level tags: medical, model training, computer vision
Detailed tags: self-supervised learning, masked image modeling, attention guided masking, co-distillation, medical image analysis

Co-distilled attention guided masked image modeling with noisy teacher for self-supervised learning on medical images


1️⃣ One-sentence summary

This paper proposes DAGMaN, a new self-supervised learning method that uses a co-distillation framework with a noisy teacher to intelligently select and mask key regions of medical images during pretraining, reducing information leakage while preserving the model's learning diversity, and ultimately achieving better results on a range of medical image analysis tasks.

Source: arXiv 2604.14506