Using Deep Learning Models Pretrained by Self-Supervised Learning for Protein Localization
1️⃣ One-Sentence Summary
This study shows that vision models pretrained with self-supervised learning on large microscopy image datasets transfer effectively to small protein-localization tasks acquired under different experimental conditions, even without additional tuning, and substantially improve model performance.
Background: Task-specific microscopy datasets are often small, making it difficult to train deep learning models that learn robust features. While self-supervised learning (SSL) has shown promise through pretraining on large, domain-specific datasets, generalizability across datasets with differing staining protocols and channel configurations remains underexplored. We investigated the generalizability of SSL models pretrained on ImageNet-1k and HPA FOV, evaluating their embeddings on OpenCell with and without fine-tuning, under two channel-mismatch strategies, and across varying fine-tuning data fractions. We additionally analyzed single-cell embeddings on a labeled OpenCell subset.

Result: DINO-based ViT backbones pretrained on HPA FOV or ImageNet-1k transfer well to OpenCell even without fine-tuning. The HPA FOV-pretrained model achieved the highest zero-shot performance (macro $F_1$ 0.822 $\pm$ 0.007). Fine-tuning further improved performance to 0.860 $\pm$ 0.013. At the single-cell level, the HPA single-cell-pretrained model achieved the highest k-nearest neighbor performance across all neighborhood sizes (macro $F_1$ $\geq$ 0.796).

Conclusion: SSL methods like DINO, pretrained on large domain-relevant datasets, enable effective use of deep learning features for fine-tuning on small, task-specific microscopy datasets.
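To make the evaluation protocol concrete, here is a minimal sketch of the zero-shot setup the abstract describes: embeddings from a frozen DINO ViT backbone are classified with k-nearest neighbors and scored by macro $F_1$. This is an illustrative reconstruction, not the authors' code. The torch.hub checkpoint below is the public ImageNet-pretrained DINO ViT-S/16; the HPA-pretrained weights, the OpenCell image tensors, and the channel handling are assumptions standing in for the paper's pipeline.

```python
# Hedged sketch: zero-shot evaluation of frozen DINO ViT embeddings with a
# k-nearest-neighbor classifier. The hub checkpoint is the public
# ImageNet-pretrained DINO ViT-S/16; the image tensors and labels below are
# hypothetical stand-ins for preprocessed OpenCell single-cell crops.
import torch
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score

device = "cuda" if torch.cuda.is_available() else "cpu"
backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
backbone.eval().to(device)

@torch.no_grad()
def embed(images: torch.Tensor) -> np.ndarray:
    """Map a batch of (N, 3, 224, 224) images to frozen DINO CLS embeddings."""
    return backbone(images.to(device)).cpu().numpy()

def knn_macro_f1(train_x, train_y, test_x, test_y, k: int = 10) -> float:
    """Fit kNN on frozen embeddings and report macro F1 on held-out cells."""
    knn = KNeighborsClassifier(n_neighbors=k).fit(train_x, train_y)
    return f1_score(test_y, knn.predict(test_x), average="macro")

# Usage (hypothetical tensors/labels standing in for OpenCell data):
# train_emb = embed(train_images); test_emb = embed(test_images)
# for k in (5, 10, 20):
#     print(k, knn_macro_f1(train_emb, train_labels, test_emb, test_labels, k))
```

Sweeping k as in the usage comment mirrors the paper's report of macro $F_1$ across all neighborhood sizes; swapping in HPA FOV-pretrained weights for the hub checkpoint would correspond to the best-performing configuration.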
Source: arXiv:2604.10970