Learning Generalizable Multimodal Representations for Software Vulnerability Detection
1️⃣ One-Sentence Summary
This paper proposes MultiVul, a multimodal contrastive learning framework that strengthens vulnerability detection by jointly leveraging source code and comment information, improving F1 score by up to 27% over conventional methods across multiple large language models.
Source code and its accompanying comments are complementary yet naturally aligned modalities: code encodes structural logic, while comments capture developer intent. However, existing vulnerability detection methods mostly rely on single-modality code representations, overlooking the complementary semantic information embedded in comments and thus limiting their generalization across complex code structures and logical relationships. To address this, we propose MultiVul, a multimodal contrastive framework that aligns code and comment representations through dual similarity learning and consistency regularization, augmented with diverse code-text pairs to improve robustness. Experiments on the widely adopted DiverseVul and Devign datasets across four large language models (LLMs), namely DeepSeek-Coder-6.7B, Qwen2.5-Coder-7B, StarCoder2-7B, and CodeLlama-7B, show that MultiVul achieves up to a 27.07% F1 improvement over prompting-based methods and 13.37% over code-only fine-tuning, while maintaining comparable inference efficiency.
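The abstract does not detail how the dual similarity learning and consistency regularization are computed; a common realization of such code-text alignment is a symmetric InfoNCE loss over paired embeddings, plus a regularizer that penalizes disagreement between the two modalities' intra-batch similarity structures. The sketch below illustrates that interpretation with NumPy; the function names, temperature value, and the exact form of the regularizer are assumptions, not MultiVul's actual implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere so dot products are cosines.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def dual_contrastive_loss(code_emb, comment_emb, temperature=0.07):
    """Symmetric (dual-direction) InfoNCE over a batch of code/comment pairs.

    Matching pairs sit on the diagonal of the similarity matrix; the loss
    pulls them together in both directions (code->comment and comment->code).
    """
    z_code = l2_normalize(code_emb)
    z_comm = l2_normalize(comment_emb)
    sim = z_code @ z_comm.T / temperature            # (B, B) scaled cosines
    idx = np.arange(sim.shape[0])
    loss_c2t = -log_softmax(sim, axis=1)[idx, idx].mean()
    loss_t2c = -log_softmax(sim.T, axis=1)[idx, idx].mean()
    return 0.5 * (loss_c2t + loss_t2c)

def consistency_reg(code_emb, comment_emb):
    """One possible consistency term: make the code-code and comment-comment
    similarity matrices agree, so both modalities induce the same geometry."""
    s_code = l2_normalize(code_emb) @ l2_normalize(code_emb).T
    s_comm = l2_normalize(comment_emb) @ l2_normalize(comment_emb).T
    return ((s_code - s_comm) ** 2).mean()
```

In this reading, the training objective would combine the two terms, e.g. `dual_contrastive_loss(c, t) + lam * consistency_reg(c, t)` with a weighting hyperparameter `lam`; perfectly aligned pairs drive the contrastive term toward zero, while the regularizer discourages the code encoder from learning a neighborhood structure the comment encoder contradicts.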
Source: arXiv: 2604.25711