arXiv submission date: 2026-04-09
📄 Abstract - Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities

As generative artificial intelligence evolves, deepfake attacks have escalated from single-modality manipulations to complex, multimodal threats. Existing forensic techniques face a severe generalization bottleneck: by relying excessively on superficial, modality-specific artifacts, they neglect the shared latent forgery knowledge hidden beneath variable physical appearances. Consequently, these models suffer catastrophic performance degradation when confronted with unseen "dark modalities." To break this limitation, this paper introduces a paradigm shift that redefines multimodal forensics from conventional "feature fusion" to "modality generalization." We propose the first modality-agnostic forgery (MAF) detection framework. By explicitly decoupling modality-specific styles, MAF precisely extracts the essential, cross-modal latent forgery knowledge. Furthermore, we define two progressive dimensions to quantify model generalization: transferability toward semantically correlated modalities (Weak MAF), and robustness against completely isolated signals of "dark modality" (Strong MAF). To rigorously assess these generalization limits, we introduce the DeepModal-Bench benchmark, which integrates diverse multimodal forgery detection algorithms and adapts state-of-the-art generalized learning methods. This study not only empirically proves the existence of universal forgery traces but also achieves significant performance breakthroughs on unknown modalities via the MAF framework, offering a pioneering technical pathway for universal multimodal defense.

Top-level tags: multi-modal, model evaluation, computer vision
Detailed tags: deepfake detection, modality generalization, forensics benchmark, cross-modal learning

Beyond Surface Artifacts: Capturing Shared Latent Forgery Knowledge Across Modalities


1️⃣ One-Sentence Summary

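This paper proposes MAF, a new modality-agnostic forgery detection framework that strips away the surface features of individual modalities (such as image and audio) to extract the deep "forgery traces" shared by all forged content, allowing it to identify previously unseen types of deepfake attacks and addressing the poor generalization of existing detection techniques.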

Source: arXiv: 2604.07763