arXiv submission date: 2026-01-29
📄 Abstract - Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification

High-dimensional structural MRI (sMRI) images are widely used for Alzheimer's Disease (AD) diagnosis. Most existing methods for sMRI representation learning rely on 3D architectures (e.g., 3D CNNs), slice-wise feature extraction with late aggregation, or training-free feature extraction using 2D foundation models (e.g., DINO). However, these three paradigms suffer from high computational cost, loss of cross-slice relations, and limited ability to extract discriminative features, respectively. To address these challenges, we propose Multimodal Visual Surrogate Compression (MVSC). It learns to compress and adapt large 3D sMRI volumes into compact 2D features, termed visual surrogates, which are better aligned with frozen 2D foundation models so that these models can extract powerful representations for the final AD classification. MVSC has two key components: a Volume Context Encoder that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner. Extensive experiments on three large-scale Alzheimer's disease benchmarks demonstrate that MVSC performs favourably on both binary and multi-class classification tasks compared with state-of-the-art methods.
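To make the idea of patch-wise slice aggregation more concrete, here is a toy NumPy sketch of collapsing a 3D volume into a single 2D image by weighting slices per patch. It is not the paper's Adaptive Slice Fusion module: the function name `fuse_slices_patchwise`, the `scores` input, and the softmax weighting are all assumptions chosen for illustration, and the abstract does not specify how the real weights are computed or how text guidance enters.

```python
import numpy as np


def fuse_slices_patchwise(volume, scores, patch):
    """Collapse a (S, H, W) volume into one (H, W) image.

    `scores` has shape (S, H // patch, W // patch): one hypothetical
    relevance score per slice and patch. A trained model would predict
    these; here they are plain inputs (an assumption for illustration).
    """
    s, h, w = volume.shape
    assert h % patch == 0 and w % patch == 0
    assert scores.shape == (s, h // patch, w // patch)

    # Softmax over the slice axis, so the weights at each patch sum to 1.
    shifted = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=0, keepdims=True)

    # Expand patch-level weights to pixel resolution.
    pixel_weights = np.repeat(np.repeat(weights, patch, axis=1), patch, axis=2)

    # Weighted sum across slices gives the 2D result.
    return (pixel_weights * volume).sum(axis=0)


rng = np.random.default_rng(0)
volume = rng.random((8, 32, 32))       # 8 synthetic slices of 32x32
scores = rng.normal(size=(8, 4, 4))    # random stand-in scores
surrogate = fuse_slices_patchwise(volume, scores, patch=8)
print(surrogate.shape)  # (32, 32)
```

With all scores equal, the weights become uniform and the result reduces to a plain average over slices, which is a convenient sanity check for the weighting logic.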

Top-level tags: medical computer vision model training
Detailed tags: alzheimer's disease mri compression multimodal learning medical image classification foundation model adaptation

Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification


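1️⃣ One-sentence summary

This paper proposes a new method called MVSC that uses text guidance to compress complex 3D brain MRI volumes into compact 2D 'visual surrogates', making more efficient use of powerful pretrained 2D models to improve the accuracy of Alzheimer's disease diagnosis.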

Source: arXiv: 2601.21673