Multimodal Backdoor Attack on VLMs for Autonomous Driving via Graffiti and Cross-Lingual Triggers
1️⃣ One-sentence summary
This paper proposes a novel stealthy backdoor attack against vision-language models for autonomous driving. Using graffiti-based visual triggers that blend naturally into urban scenes together with cross-lingual text triggers, it achieves a high attack success rate without degrading the model's normal performance, revealing a new class of security threats facing autonomous driving systems.
Vision-language models (VLMs) are rapidly being integrated into safety-critical systems such as autonomous driving, making them an important surface for potential backdoor attacks. Existing backdoor attacks mainly rely on unimodal, explicit, and easily detectable triggers, making it difficult to construct covert yet stable attack channels in autonomous driving scenarios. GLA introduces two naturalistic triggers: graffiti-based visual patterns generated via Stable Diffusion inpainting, which blend seamlessly into urban scenes, and cross-lingual text triggers, which introduce distributional shifts while maintaining semantic consistency, providing a robust language-side trigger signal. Experiments on DriveVLM show that GLA requires only a 10% poisoning ratio to achieve a 90% attack success rate (ASR) with a 0% false positive rate (FPR). More insidiously, the backdoor does not weaken the model on clean tasks but instead improves metrics such as BLEU-1, making the attack difficult to identify with traditional detection methods based on performance degradation. This study reveals underestimated security threats in autonomous-driving VLMs and provides a new attack paradigm for backdoor evaluation in safety-critical multimodal systems.
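To make the two headline metrics concrete, here is a minimal sketch of how ASR and FPR could be computed from model outputs. This is an illustrative assumption, not the paper's evaluation code: the function names, the substring-match criterion for "attack success", and the toy outputs are all hypothetical.

```python
# Hypothetical sketch of ASR/FPR computation for a backdoored VLM.
# Assumption: an attack "succeeds" when the attacker's target response
# appears in the model output; real evaluations may use stricter matching.

def attack_success_rate(triggered_outputs, target_response):
    """ASR: fraction of triggered inputs that elicit the target response."""
    hits = sum(target_response in out for out in triggered_outputs)
    return hits / len(triggered_outputs)

def false_positive_rate(clean_outputs, target_response):
    """FPR: fraction of clean (trigger-free) inputs that still elicit it."""
    hits = sum(target_response in out for out in clean_outputs)
    return hits / len(clean_outputs)

# Toy example with made-up driving-instruction outputs:
target = "proceed through the intersection"
triggered = ["proceed through the intersection now"] * 9 + ["slow down"]
clean = ["stop for the pedestrian", "yield to oncoming traffic"] * 5

print(attack_success_rate(triggered, target))  # 0.9
print(false_positive_rate(clean, target))      # 0.0
```

A 90% ASR with 0% FPR, as reported for GLA, corresponds to the backdoor firing reliably when both triggers are present while leaving clean inputs untouched, which is exactly what makes it hard to detect from clean-task behavior alone.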
Source: arXiv: 2604.04630