arXiv submission date: 2026-01-29
📄 Abstract - WMVLM: Evaluating Diffusion Model Image Watermarking via Vision-Language Models

Digital watermarking is essential for securing generated images from diffusion models. Accurate watermark evaluation is critical for algorithm development, yet existing methods have significant limitations: they lack a unified framework for both residual and semantic watermarks, provide results without interpretability, neglect comprehensive security considerations, and often use inappropriate metrics for semantic watermarks. To address these gaps, we propose WMVLM, the first unified and interpretable evaluation framework for diffusion model image watermarking via vision-language models (VLMs). We redefine quality and security metrics for each watermark type: residual watermarks are evaluated by artifact strength and erasure resistance, while semantic watermarks are assessed through latent distribution shifts. Moreover, we introduce a three-stage training strategy to progressively enable the model to achieve classification, scoring, and interpretable text generation. Experiments show WMVLM outperforms state-of-the-art VLMs with strong generalization across datasets, diffusion models, and watermarking methods.

Top-level tags: model evaluation, computer vision, multi-modal
Detailed tags: watermark evaluation, vision-language models, diffusion models, security metrics, interpretability

WMVLM: Evaluating Diffusion Model Image Watermarking via Vision-Language Models


1️⃣ One-Sentence Summary

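This paper proposes WMVLM, a unified evaluation framework that uses vision-language models to comprehensively and interpretably evaluate the quality and security of watermarks in images generated by diffusion models, addressing several limitations of existing methods in evaluating different types of watermarks.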

Source: arXiv: 2601.21610