arXiv submission date: 2026-06-25
📄 Abstract - Multi-modality Image Fusion under Adverse Weather: Mask-Guided Feature Restoration and Interaction

Multi-modality image fusion (MMIF) enhances scene representation by exploiting complementary cues from different modalities. Adverse weather, however, causes significant image degradation, disrupting feature representation and requiring simultaneous feature restoration and cross-modal complementarity. Existing methods often struggle with effective representation learning under such conditions, limiting their practical performance. To address these challenges, we propose a mask-guided MMIF method that integrates feature restoration and interaction. We first introduce "Pseudo Ground Truth" to simplify training, promoting faster and more effective feature learning. Then, we design a mask generation mechanism based on the mapping relationship between the fused result and the source images, quantifying the relative contribution of each modality during the fusion process. By incorporating the proposed mask-guided cross-modal cross-attention mechanism, the network is encouraged to selectively attend to informative features during modality interaction, mitigating the risk of overfitting to the static distribution of the "Pseudo Ground Truth". Additionally, we propose a mask-guided learning strategy and a task-coupled degradation-aware learning strategy to balance feature restoration and interaction. Extensive experiments on synthetic and real-world datasets demonstrate that our method surpasses state-of-the-art approaches in visual quality, quantitative metrics, and downstream tasks. The source code is available at this https URL.
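The abstract says the masks quantify each modality's relative contribution from the mapping between the fused result and the source images, but it does not give the formula. The following is only a minimal sketch of one way such a mask could be computed, assuming per-pixel absolute differences as the measure of closeness. The function name `contribution_masks`, the `eps` parameter, and the weighting rule are all hypothetical and are not taken from the paper.

```python
import numpy as np

def contribution_masks(fused, src_a, src_b, eps=1e-6):
    """Hypothetical per-pixel contribution masks (not the paper's formula).

    All inputs are float arrays of the same shape, e.g. H x W images.
    Returns two masks in [0, 1] that sum to 1 at every pixel.
    """
    # Absolute deviation of the fused image from each source image.
    dist_a = np.abs(fused - src_a)
    dist_b = np.abs(fused - src_b)
    # The source that lies closer to the fused value gets the larger weight.
    # The eps terms avoid division by zero and yield 0.5 when both distances are 0.
    mask_a = (dist_b + eps / 2) / (dist_a + dist_b + eps)
    mask_b = 1.0 - mask_a
    return mask_a, mask_b

# Toy usage: a fused image that copies source A where the selector is True
# and source B elsewhere.
src_a = np.zeros((2, 2))
src_b = np.ones((2, 2))
selector = np.array([[True, False], [False, True]])
fused = np.where(selector, src_a, src_b)
mask_a, mask_b = contribution_masks(fused, src_a, src_b)
```

In this toy case `mask_a` is close to 1 wherever the fused pixel was taken from source A and close to 0 elsewhere. In a mask-guided attention setup, masks like these could be used to weight the features of each modality, though how the paper applies them is not specified in the abstract.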

Top-level tags: computer vision, multi-modal
Detailed tags: image fusion, adverse weather, mask-guided, feature restoration, cross-modal attention

Multi-modality Image Fusion under Adverse Weather: Mask-Guided Feature Restoration and Interaction


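1️⃣ One-sentence summary

This paper proposes a method for fusing images from different sensors under adverse weather: it generates masks to estimate how much each source image contributes, and uses them to guide the model to restore degraded features and strengthen complementary information, which markedly improves the clarity and practical usefulness of the fused image.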

Source: arXiv: 2606.26812