📄 Abstract - UniGame: Turning a Unified Multimodal Model Into Its Own Adversary
Unified Multimodal Models (UMMs) have shown impressive performance in both understanding and generation within a single architecture. However, UMMs still exhibit a fundamental inconsistency: understanding favors compact embeddings, whereas generation favors reconstruction-rich representations. This structural trade-off produces misaligned decision boundaries, degraded cross-modal coherence, and heightened vulnerability under distributional and adversarial shifts. In this paper, we present UniGame, a self-adversarial post-training framework that directly targets this inconsistency. By applying a lightweight perturber at the shared token interface, UniGame enables the generation branch to actively seek out and challenge fragile understanding, turning the model into its own adversary. Experiments demonstrate that UniGame significantly improves consistency (+4.6%). It also achieves substantial improvements in understanding (+3.6%), generation (+0.02), and out-of-distribution and adversarial robustness (+4.8% and +6.2% on NaturalBench and AdVQA). The framework is architecture-agnostic, introduces less than 1% additional parameters, and is complementary to existing post-training methods. These results position adversarial self-play as a general and effective principle for enhancing the coherence, stability, and unified competence of future multimodal foundation models. The official code is available at: this https URL
UniGame: A Self-Adversarial Post-Training Framework for Unified Multimodal Models /
UniGame: Turning a Unified Multimodal Model Into Its Own Adversary
1️⃣ One-Sentence Summary
This paper presents UniGame, the first self-adversarial post-training framework to target the structural inconsistency between the understanding and generation pathways of unified multimodal models; by letting the generation branch actively probe the understanding branch's weak points, it significantly improves model consistency and robustness.
2️⃣ Key Innovations
1. Self-adversarial training framework
- Innovation: turns a unified multimodal model into its own adversary; a lightweight perturber creates bounded, structured perturbations at the shared visual-token interface, letting the generation branch actively challenge the understanding branch's fragile regions
- Difference/improvement: directly targets the structural inconsistency between the understanding and generation pathways instead of relying on surrogate objectives
- Significance: markedly improves consistency, performance, and robustness; the framework is architecture-agnostic and adds <1% additional parameters
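The perturber described above can be sketched as a small module that maps token states to a bounded, structured shift. This is a toy NumPy illustration under assumed shapes and an assumed tanh-based bound; the names (`Perturber`, `eps`) and architecture are illustrative, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class Perturber:
    """Toy lightweight perturber: MLP mapping visual tokens to a bounded shift."""
    def __init__(self, dim, hidden, eps=0.1):
        self.w1 = rng.standard_normal((dim, hidden)) * 0.02
        self.w2 = rng.standard_normal((hidden, dim)) * 0.02
        self.eps = eps  # bound on the perturbation magnitude (assumption)

    def __call__(self, tokens):
        h = np.tanh(tokens @ self.w1)     # compact hidden state
        delta = np.tanh(h @ self.w2)      # raw perturbation in (-1, 1)
        return tokens + self.eps * delta  # bounded, structured perturbation

tokens = rng.standard_normal((4, 64))     # 4 visual tokens, dim 64
perturber = Perturber(dim=64, hidden=16)
out = perturber(tokens)
print(np.max(np.abs(out - tokens)) <= perturber.eps)  # → True
```

Because the raw perturbation passes through `tanh` and is scaled by `eps`, the shift at every token position is guaranteed to stay within the bound, which is what makes the adversary "lightweight and bounded" rather than free-form.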
2. Decoder-constrained adversarial perturbation
- Innovation: enforces on-manifold perturbations through the model's native decoder, producing semantically valid adversarial examples
- Difference/improvement: outperforms embedding-space perturbation methods, reaching 82.2% accuracy on VQAv2
- Significance: yields more realistic adversarial examples that effectively expand the decision boundary
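The idea of decoder-constrained perturbation can be shown with a toy example: instead of perturbing the input directly, the latent code is perturbed and re-decoded, so every adversarial candidate is a valid decoder output. The decoder (unit circle as the "data manifold") and the scalar "understanding" score below are toy stand-ins, not the paper's components.

```python
import numpy as np

def decode(z):
    # toy decoder: all outputs lie on the unit circle (the "data manifold")
    return np.array([np.cos(z), np.sin(z)])

def score(x):
    # toy understanding head: confidence for one class
    return x[0]

z = 0.1                          # latent code of a clean sample
for _ in range(20):              # search in latent space, not pixel space
    g = (score(decode(z + 1e-4)) - score(decode(z - 1e-4))) / 2e-4
    z -= 0.1 * g                 # move the latent to lower the model's score

x_adv = decode(z)
print(abs(np.linalg.norm(x_adv) - 1.0) < 1e-9)  # still on the manifold
```

However far the search moves the latent, the decoded sample never leaves the manifold, which is the intuition behind "semantically valid" adversarial examples versus arbitrary embedding-space noise.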
3. Dual-path training mechanism
- Innovation: the clean path preserves the original semantics and computes the supervised loss, while the generation path produces image candidates from perturbed embeddings to challenge the understanding branch, with CLIP used as a semantic-consistency check
- Difference/improvement: combines the standard supervised loss with an adversarial loss, balancing accuracy and robustness
- Significance: keeps the model performing well on both clean data and adversarial examples
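The dual-path objective can be sketched as a supervised loss on the clean path plus a weighted adversarial loss over perturbed candidates, where candidates failing a CLIP-style cosine-similarity gate are discarded. All function names, the weight `lam`, and the threshold `tau` are illustrative assumptions.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def dual_path_loss(clean_loss, adv_losses, text_emb, cand_embs,
                   lam=0.5, tau=0.3):
    # keep only candidates still semantically consistent with the text
    kept = [l for l, e in zip(adv_losses, cand_embs)
            if cosine(text_emb, e) >= tau]
    adv = float(np.mean(kept)) if kept else 0.0
    return clean_loss + lam * adv   # supervised + weighted adversarial term

text = np.array([1.0, 0.0])
cands = [np.array([0.9, 0.1]),      # passes the consistency gate
         np.array([-1.0, 0.2])]     # fails the gate, dropped
total = dual_path_loss(0.8, [0.4, 5.0], text, cands)
print(round(total, 2))  # → 1.0  (0.8 clean + 0.5 * 0.4 adversarial)
```

The gate matters: without it, the large loss of the semantically invalid second candidate (5.0) would dominate the adversarial term and push training toward meaningless perturbations.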
3️⃣ Main Results and Value
Result highlights
- Average gains of 3.1% to 3.6% on understanding tasks, with the largest improvements on challenging visual-reasoning tasks such as object counting, spatial relations, and dense object detection
- Outperforms baseline models and other large models on both consistency and generation on the UnifiedBench and WISE benchmarks
- Adding only 5K steps of UniGame training on top of RecA yields consistent gains across metrics
Practical value
- Computationally efficient and architecture-agnostic; can complement existing post-training pipelines
- Offers an effective, low-cost route to substantially improving multimodal model performance
- Strengthens robustness under out-of-distribution shifts and adversarial attacks
4️⃣ Glossary
- UniGame: a self-adversarial post-training framework that improves the consistency of unified multimodal models by letting the generation branch actively challenge the understanding branch
- UMMs: Unified Multimodal Models, which handle both understanding and generation with a single architecture
- Perturber: a compact network module that maps fused visual states to perturbation tokens, producing bounded perturbations for constructing adversarial examples
- on-manifold adversarial examples: adversarial cases generated under a decoder constraint so that they remain semantically meaningful
- UnifiedBench: a reconstruction-based consistency benchmark for UMMs that measures information preservation via a 'describe-generate-compare' protocol
- WISE: a world-knowledge-aware text-to-image benchmark of 1,000 knowledge-intensive prompts that evaluates the consistency between generated images and their textual descriptions
- VQAv2: a visual question answering benchmark dataset used to evaluate model performance
- CLIP: a text-image semantic matching model, used in this work as a constraint to ensure the semantic consistency of adversarial examples
- RecA: an existing post-training method that UniGame can complement
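The 'describe-generate-compare' protocol mentioned for UnifiedBench can be illustrated with a toy round trip: caption an image, regenerate an image from the caption, and score how much information survives. The `embed`, `caption`, and `generate` functions below are noisy stand-ins for the real encoder and the two UMM branches, not the actual benchmark pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

def embed(image):                 # stand-in for a frozen image encoder
    return image / np.linalg.norm(image)

def caption(image):               # stand-in for the understanding branch
    return image + rng.normal(0, 0.05, image.shape)   # lossy description

def generate(text):               # stand-in for the generation branch
    return text + rng.normal(0, 0.05, text.shape)     # lossy synthesis

def consistency(image):
    round_trip = generate(caption(image))             # describe, then generate
    a, b = embed(image), embed(round_trip)
    return float(a @ b)           # compare: cosine similarity of embeddings

img = rng.standard_normal(128)
print(consistency(img) > 0.9)     # small round-trip loss → high consistency
```

A consistent model loses little information across the describe-generate round trip, so the original and regenerated embeddings stay close; the inconsistency UniGame targets shows up as a drop in exactly this kind of score.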