📄 Abstract - UniGame: Turning a Unified Multimodal Model Into Its Own Adversary

Unified Multimodal Models (UMMs) have shown impressive performance in both understanding and generation with a single architecture. However, UMMs still exhibit a fundamental inconsistency: understanding favors compact embeddings, whereas generation favors reconstruction-rich representations. This structural trade-off produces misaligned decision boundaries, degraded cross-modal coherence, and heightened vulnerability under distributional and adversarial shifts. In this paper, we present UniGame, a self-adversarial post-training framework that directly targets the inconsistencies. By applying a lightweight perturber at the shared token interface, UniGame enables the generation branch to actively seek and challenge fragile understanding, turning the model itself into its own adversary. Experiments demonstrate that UniGame significantly improves the consistency (+4.6%). Moreover, it also achieves substantial improvements in understanding (+3.6%), generation (+0.02), out-of-distribution and adversarial robustness (+4.8% and +6.2% on NaturalBench and AdVQA). The framework is architecture-agnostic, introduces less than 1% additional parameters, and is complementary to existing post-training methods. These results position adversarial self-play as a general and effective principle for enhancing the coherence, stability, and unified competence of future multimodal foundation models. The official code is available at: this https URL
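
The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the general min-max idea rather than the paper's method. It assumes a toy logistic classifier standing in for the understanding branch, and it swaps the learned perturber for a one-step sign-gradient perturbation bounded by `eps`. The names `loss_and_grads`, `perturb`, and `train`, the value of `eps`, and the data format are all hypothetical; UniGame's actual perturber, losses, and decoder constraints are not specified in this summary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grads(w, x, y):
    """Logistic loss for one example, with gradients w.r.t. weights and input."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to keep log() finite
    loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    g = p - y  # derivative of the loss w.r.t. the logit
    grad_w = [g * xi for xi in x]
    grad_x = [g * wi for wi in w]
    return loss, grad_w, grad_x


def perturb(x, grad_x, eps):
    """Adversary step (assumption: sign-gradient, not a learned perturber).

    Moves each coordinate by at most eps in the loss-increasing direction.
    """
    def sign(v):
        return 1.0 if v > 0 else -1.0 if v < 0 else 0.0

    return [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]


def train(data, dim, eps, steps=200, lr=0.1, adversarial=True):
    """Alternate the adversary step (maximize loss) and the model step (minimize loss)."""
    w = [0.0] * dim
    for _ in range(steps):
        for x, y in data:
            if adversarial:
                _, _, gx = loss_and_grads(w, x, y)
                x = perturb(x, gx, eps)  # hardest nearby input for the current model
            _, gw, _ = loss_and_grads(w, x, y)
            w = [wi - lr * gi for wi, gi in zip(w, gw)]  # gradient descent on weights
    return w
```

Calling `train(data, dim=2, eps=0.1)` on a list of `(features, label)` pairs returns weights fitted against perturbed inputs; passing `adversarial=False` gives the plain baseline for comparison. The sketch captures only the alternating maximize-then-minimize structure, not the shared token interface or the generation branch.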

Top-level tags: multi-modal model training aigc
Detailed tags: adversarial training multimodal consistency unified models robustness self-adversarial

UniGame: A Self-Adversarial Post-Training Framework for Unified Multimodal Models / UniGame: Turning a Unified Multimodal Model Into Its Own Adversary


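1️⃣ One-Sentence Summary

This paper proposes UniGame, the first self-adversarial post-training framework that targets the structural inconsistency between the understanding and generation pathways of unified multimodal models. By having the generation branch actively challenge the weak points of the understanding branch, it significantly improves model consistency and robustness.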


2️⃣ Key Innovations

1. Self-adversarial training framework

2. Decoder-constrained adversarial perturbation

3. Dual-path training mechanism


3️⃣ Main Results and Value

Result highlights

Practical value


4️⃣ Glossary
