arXiv submission date: 2026-03-16
📄 Abstract - Rethinking LLM Watermark Detection in Black-Box Settings: A Non-Intrusive Third-Party Framework

While watermarking serves as a critical mechanism for LLM provenance, existing secret-key schemes tightly couple detection with injection, requiring access to keys or provider-side scheme-specific detectors for verification. This dependency creates a fundamental barrier for real-world governance, as independent auditing becomes impossible without compromising model security or relying on the opaque claims of service providers. To resolve this dilemma, we introduce TTP-Detect, a pioneering black-box framework designed for non-intrusive, third-party watermark verification. By decoupling detection from injection, TTP-Detect reframes verification as a relative hypothesis testing problem. It employs a proxy model to amplify watermark-relevant signals and a suite of complementary relative measurements to assess the alignment of the query text with watermarked distributions. Extensive experiments across representative watermarking schemes, datasets and models demonstrate that TTP-Detect achieves superior detection performance and robustness against diverse attacks.

Top-level tags: llm systems model evaluation
Detailed tags: watermark detection black-box verification third-party auditing hypothesis testing security

Rethinking LLM Watermark Detection in Black-Box Settings: A Non-Intrusive Third-Party Framework


1️⃣ One-Sentence Summary

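This paper proposes TTP-Detect, a novel framework that lets a third party independently detect watermarks in text generated by large language models, without access to the model's internal secrets and without relying on the service provider. It thereby addresses the difficulty of independently auditing and regulating existing watermarking schemes.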

Source: arXiv: 2603.14968