StructBreak: Structural Cognitive Overload-Induced Safety Failures in MLLMs

📄 Abstract

Multimodal Large Language Models (MLLMs) excel at structural reasoning yet exhibit sharp logical brittleness in structural consistency. We term this phenomenon Structural Cognitive Overload (SCO), a byproduct of the contention between deep reasoning and safety alignment. Prior work, however, has predominantly targeted typographic and pixel-level perturbations, leaving SCO largely unexplored. To this end, we propose StructBreak, an automated end-to-end framework designed to quantify SCO. Using StructBreak, we uncover a novel higher-order cognitive overload attack paradigm; notably, this attack operates in a practical black-box setting and requires no internal model access. We then use the framework to establish a comprehensive benchmark spanning ten diverse threat scenarios. Empirical evaluations on six leading MLLMs reveal that SCO readily triggers toxic generation, yielding a 92% average attack success rate (ASR), and up to 97% on Gemini 2.5. To elucidate the mechanism of SCO, we further conduct model-level interpretations spanning attention dynamics, latent space topology, and geometric analysis. Our findings reveal that StructBreak acts as a novel structural channel for circumventing safety filters. Furthermore, the limited efficacy of inherent safety mechanisms underscores that current alignment paradigms are insufficient for the era of complex multimodal reasoning.

1️⃣ One-Sentence Summary

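This paper finds that multimodal large language models, when processing complex structural information, exhibit a phenomenon termed "Structural Cognitive Overload" that arises from the conflict between deep reasoning and safety alignment. Building on this finding, the authors develop StructBreak, a framework that automatically generates attack strategies under black-box conditions. Experiments show that the method elicits harmful content with a success rate of up to 97%, exposing serious shortcomings of existing safety mechanisms in complex multimodal reasoning scenarios.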
