📄 Abstract - Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
Multimodal contrastive learning is increasingly enriched by going beyond image-text pairs. Among recent contrastive methods, Symile is a strong approach for this challenge because its multiplicative interaction objective captures higher-order cross-modal dependence. Yet, we find that Symile treats all modalities symmetrically and does not explicitly model reliability differences, a limitation that becomes especially pronounced in trimodal multiplicative interactions. In practice, modalities beyond image-text pairs can be misaligned, weakly informative, or missing, and treating them uniformly can silently degrade performance. This fragility can be hidden in the multiplicative interaction: Symile may outperform pairwise CLIP even while a single unreliable modality silently corrupts the product terms. We propose Gated Symile, a contrastive gating mechanism that adapts modality contributions on an attention-based, per-candidate basis. The gate suppresses unreliable inputs by interpolating embeddings toward learnable neutral directions and incorporating an explicit NULL option when reliable cross-modal alignment is unlikely. Across a controlled synthetic benchmark that exposes this fragility and three real-world trimodal datasets on which such failures could be masked by aggregate metrics, Gated Symile achieves higher top-1 retrieval accuracy than well-tuned Symile and CLIP models. More broadly, our results highlight gating as a step toward robust multimodal contrastive learning when more than two modalities are involved and some are imperfect.
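The two mechanisms the abstract names can be illustrated concretely. Below is a minimal NumPy sketch of (a) Symile's multiplicative interaction, i.e. a trilinear inner product over three embeddings, and (b) the gating idea of interpolating an unreliable modality's embedding toward a neutral direction. All function names, the choice of neutral vector, and the hand-set gate value are illustrative assumptions, not the paper's actual implementation, which uses learned encoders, learned gates, and a contrastive loss over batches.

```python
import numpy as np

def trilinear_score(x, y, z):
    # Symile-style multiplicative interaction: the trilinear inner
    # product, i.e. the sum over the elementwise product of three
    # embeddings. One bad factor corrupts every product term.
    return float(np.sum(x * y * z))

def gated_embedding(e, neutral, gate):
    # Hypothetical gating step: interpolate the embedding toward a
    # neutral direction. gate in [0, 1]; gate=0 suppresses the
    # modality entirely by collapsing it onto the neutral vector.
    return gate * e + (1.0 - gate) * neutral

# Toy example: two aligned modalities plus one unreliable one.
rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)
y = x + 0.1 * rng.standard_normal(d)   # aligned with x
z_noise = rng.standard_normal(d)       # unreliable third modality
neutral = np.ones(d)                   # neutral direction (assumption)

raw = trilinear_score(x, y, z_noise)
gated = trilinear_score(x, y, gated_embedding(z_noise, neutral, gate=0.0))
# With gate=0 and an all-ones neutral vector, the trilinear score
# reduces to the pairwise dot product of x and y, so the noisy
# modality can no longer corrupt the product terms.
```

With the all-ones neutral direction, fully gating out a modality recovers a pairwise CLIP-style similarity, which is one way to read the abstract's NULL option: when reliable cross-modal alignment is unlikely, the model falls back toward a lower-order interaction rather than multiplying in noise.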
Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
1️⃣ One-sentence summary
This paper finds that Symile, a state-of-the-art multimodal contrastive learning method, has a hidden fragility when handling more than two modalities (e.g., image, text, audio): because it treats all modalities uniformly, unreliable information in one modality can silently degrade the model's performance. To address this, the authors propose Gated Symile, an improved method with a gating mechanism that dynamically assesses and adjusts each modality's contribution, achieving more robust and accurate retrieval performance across several real-world datasets.