arXiv submission date: 2026-02-26
📄 Abstract - Towards Multimodal Domain Generalization with Few Labels

Multimodal models ideally should generalize to unseen domains while remaining data-efficient to reduce annotation costs. To this end, we introduce and study a new problem, Semi-Supervised Multimodal Domain Generalization (SSMDG), which aims to learn robust multimodal models from multi-source data with few labeled samples. We observe that existing approaches fail to address this setting effectively: multimodal domain generalization methods cannot exploit unlabeled data, semi-supervised multimodal learning methods ignore domain shifts, and semi-supervised domain generalization methods are confined to single-modality inputs. To overcome these limitations, we propose a unified framework featuring three key components: Consensus-Driven Consistency Regularization, which obtains reliable pseudo-labels through confident fused-unimodal consensus; Disagreement-Aware Regularization, which effectively utilizes ambiguous non-consensus samples; and Cross-Modal Prototype Alignment, which enforces domain- and modality-invariant representations while promoting robustness under missing modalities via cross-modal translation. We further establish the first SSMDG benchmarks, on which our method consistently outperforms strong baselines in both standard and missing-modality scenarios. Our benchmarks and code are available at this https URL.
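The abstract's Consensus-Driven Consistency Regularization obtains pseudo-labels only when the fused prediction is confident and the unimodal heads agree with it. A minimal sketch of that selection rule is below; the confidence threshold `tau`, the shape of the logits, and the exact agreement criterion are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax along the class axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consensus_pseudo_labels(fused_logits, unimodal_logits_list, tau=0.9):
    """Return (labels, mask). A sample's pseudo-label is kept (mask=True)
    only if the fused prediction is confident (max prob >= tau, an assumed
    threshold) AND every unimodal head predicts the same class."""
    p_fused = softmax(fused_logits)
    labels = p_fused.argmax(axis=-1)
    confident = p_fused.max(axis=-1) >= tau
    agree = np.ones_like(confident)  # boolean array, all True initially
    for logits in unimodal_logits_list:
        agree &= softmax(logits).argmax(axis=-1) == labels
    return labels, confident & agree
```

Samples that fail this check are not discarded in the paper; they are handled by the Disagreement-Aware Regularization component instead.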

Top-level tags: multi-modal, model training, machine learning
Detailed tags: domain generalization, semi-supervised learning, multimodal fusion, pseudo-labeling, robust representation

Towards Multimodal Domain Generalization with Few Labels


1️⃣ One-sentence summary

This paper introduces a new problem, Semi-Supervised Multimodal Domain Generalization (SSMDG), together with a solution: using a small amount of labeled data and a large amount of unlabeled data to train multimodal models that generalize to unseen domains and remain robust to missing modalities.

Source: arXiv:2602.22917