arXiv submission date: 2026-03-23
📄 Abstract - Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy

Large language models can rewrite text to embed hidden payloads while preserving surface-level meaning, a capability that opens covert channels between cooperating AI systems and poses challenges for alignment monitoring. We study the information-theoretic cost of such embedding. Our main result is that any steganographic scheme that preserves the semantic load of a covertext~$M_1$ while encoding a payload~$P$ into a stegotext~$M_2$ must satisfy $K(M_2) \geq K(M_1) + K(P) - O(\log n)$, where $K$ denotes Kolmogorov complexity and $n$ is the combined message length. A corollary is that any non-trivial payload forces a strict complexity increase in the stegotext, regardless of how cleverly the encoder distributes the signal. Because Kolmogorov complexity is uncomputable, we ask whether practical proxies can detect this predicted increase. Drawing on the classical correspondence between lossless compression and Kolmogorov complexity, we argue that language-model perplexity occupies an analogous role in the probabilistic regime and propose the Binoculars perplexity-ratio score as one such proxy. Preliminary experiments with a color-based LLM steganographic scheme support the theoretical prediction: a paired $t$-test over 300 samples yields $t = 5.11$, $p < 10^{-6}$.
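The abstract leans on the classical correspondence between lossless compression and Kolmogorov complexity: compressed length is a computable upper-bound proxy for $K$. The sketch below illustrates the predicted trend with that proxy, using `zlib` compressed length in place of $K$ and a toy character-interleaving embedder in place of an LLM rewriter (both are illustrative assumptions, not the paper's scheme): embedding a payload raises the complexity proxy of the stegotext above that of the covertext.

```python
import zlib

def k_proxy(text: str) -> int:
    """Compressed length in bytes: a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(text.encode("utf-8"), level=9))

covertext = "the quick brown fox jumps over the lazy dog " * 20
payload = "f3a9c1d7e5b2"  # hypothetical hidden payload, hex-encoded

# Toy "stegotext": interleave payload characters between the first words.
# Real LLM steganography rewrites text fluently; this only shows the
# complexity trend K(M2) > K(M1) that the bound predicts.
words = covertext.split()
stego_words = []
for i, w in enumerate(words):
    stego_words.append(w)
    if i < len(payload):
        stego_words.append(payload[i])
stegotext = " ".join(stego_words)

print(k_proxy(covertext), k_proxy(stegotext))
```

The payload characters both add incompressible material and disrupt the covertext's repetitive structure, so the compressed length strictly increases, mirroring the bound $K(M_2) \geq K(M_1) + K(P) - O(\log n)$.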

Top-level tags: llm theory natural language processing
Detailed tags: steganography kolmogorov complexity perplexity detection information theory

Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy


1️⃣ One-sentence summary

From an information-theoretic standpoint, this paper proves that hiding information in text with a large language model necessarily increases the text's complexity, and proposes a practical method, based on model perplexity, for detecting such hidden information.
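The reported detection result ($t = 5.11$, $p < 10^{-6}$ over 300 samples) comes from a paired $t$-test on per-text detection scores. As a minimal stdlib-only sketch of that statistic, the code below computes the paired $t$ value on synthetic score pairs (the data and score distributions are invented for illustration, not taken from the paper):

```python
import math
import random
import statistics

def paired_t(xs, ys):
    """Paired t statistic: mean of the pairwise differences over its standard error."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Synthetic perplexity-ratio scores (hypothetical, NOT the paper's data):
# each stegotext gets a slightly higher mean score than its paired covertext.
random.seed(0)
cover = [random.gauss(0.80, 0.05) for _ in range(300)]
stego = [c + random.gauss(0.03, 0.05) for c in cover]

t = paired_t(stego, cover)
print(f"t = {t:.2f} over {len(cover)} pairs")
```

Pairing each stegotext with its own covertext cancels per-text variation, which is why even a small mean shift in the score yields a large $t$ at $n = 300$.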

Source: arXiv: 2603.21567