TooBad: Backdoor Diffusion Models with Ultra-Low Poison Rate and Imperceptible Trigger
1️⃣ One-sentence summary
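This paper proposes TooBad, a new backdoor attack framework for diffusion models. By optimizing the trigger design, it needs only a very small fraction (0.5%) of malicious training data to implant a backdoor efficiently without degrading the model's normal generation ability, and it easily bypasses existing defenses, revealing a new security threat to diffusion models.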
Diffusion models (DMs), despite their impressive capabilities across a wide range of generative tasks, have been shown to be vulnerable to backdoor attacks. However, existing backdoor methods face critical trade-offs among four key factors: attack performance, stealthiness, time complexity, and required poison rate. For example, achieving high attack performance typically demands a high poison rate and prolonged training, which undermines stealthiness and makes the attack easier for backdoor defenses to detect. This paper proposes TooBad (trigger optimization for backdoor diffusion models), a backdoor framework that introduces a novel DM-tailored trigger optimization technique to dramatically enhance the performance of backdoor attacks on DMs. Experiments on representative benchmarks such as CIFAR-10 show that TooBad can achieve high attack success rates (ASRs, $> 85\%$) at only a 0.5% poison rate, significantly lower than the 10% typically required by prior work on the same datasets. At a 5% poison rate, TooBad reaches nearly 100% ASR within just 3-5 backdoor injection epochs, whereas existing methods need at least 30-50 epochs at double the poison rate for comparable results. Despite its potency, TooBad easily evades SOTA defenses and maintains high utility. These results reveal a critical threat to DMs and highlight the need for more robust defenses against such stealthy yet efficient attacks.
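To put the poison rates quoted above into concrete terms, the following is a minimal sketch of the arithmetic. It assumes that the poison rate is measured as a fraction of the standard 50,000-image CIFAR-10 training split and that poisoned samples are chosen uniformly at random; the abstract confirms neither. The names `poisoned_count` and `choose_poison_indices` are hypothetical helpers for illustration and are not part of TooBad.

```python
import random

CIFAR10_TRAIN_SIZE = 50_000  # size of the standard CIFAR-10 training split


def poisoned_count(dataset_size: int, poison_rate: float) -> int:
    """Number of samples to poison, assuming the rate is a fraction of the training set."""
    return round(dataset_size * poison_rate)


def choose_poison_indices(dataset_size: int, poison_rate: float, seed: int = 0) -> list[int]:
    """Pick which sample indices would carry the trigger.

    Uniform random selection is an assumption made for this sketch only.
    """
    rng = random.Random(seed)
    count = poisoned_count(dataset_size, poison_rate)
    return sorted(rng.sample(range(dataset_size), count))


# Compare the rates mentioned in the abstract: 0.5%, 5%, and the 10% baseline.
for rate in (0.005, 0.05, 0.10):
    n = poisoned_count(CIFAR10_TRAIN_SIZE, rate)
    print(f"poison rate {rate:.1%}: {n} poisoned images")
```

Under these assumptions, a 0.5% poison rate corresponds to 250 poisoned images, compared with 2,500 at 5% and 5,000 at the 10% baseline.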
Source: arXiv:2606.23362