Incompleteness of AI Safety Verification via Kolmogorov Complexity
1️⃣ One-Sentence Summary
From an information-theoretic perspective, this paper proves that every finite, formal AI safety verifier has a fundamental limitation: once the behavioral complexity of an AI system exceeds a certain threshold, the verifier can no longer certify all instances that genuinely comply with the safety policy. This reveals an intrinsic limit of safety verification that is independent of computational resources.
Ensuring that artificial intelligence (AI) systems satisfy formal safety and policy constraints is a central challenge in safety-critical domains. While limitations of verification are often attributed to combinatorial complexity and model expressiveness, we show that they arise from intrinsic information-theoretic limits. We formalize policy compliance as a verification problem over encoded system behaviors and analyze it using Kolmogorov complexity. We prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold beyond which true policy-compliant instances cannot be certified once their complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. This reveals a fundamental limitation of AI safety verification independent of computational resources, and motivates proof-carrying approaches that provide instance-level correctness guarantees.
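The incompleteness result in the abstract follows the shape of Chaitin's incompleteness theorem; the classical statement and proof sketch below are a reconstruction of that underlying mechanism, not the paper's exact formalization, which may differ in its encoding of policy compliance.

```latex
% Chaitin's incompleteness theorem (classical form); the paper's
% verifier-threshold result is stated analogously in the abstract.
\begin{theorem}[Chaitin]
Let $F$ be a sound, computably enumerable formal system. Then there is a
constant $c_F$ (depending only on $F$) such that $F$ proves no statement of
the form $K(x) > c_F$, even though such statements are true for all but
finitely many strings $x$.
\end{theorem}
\begin{proof}[Proof sketch]
Suppose $F$ proved $K(x) > c$ for some string $x$ and some $c$. The program
``enumerate $F$'s theorems until a proof of some statement $K(x) > c$
appears; output that $x$'' describes $x$ using $O(\log c) + O(1)$ bits
(the constant covers the enumerator, the logarithmic term encodes $c$).
For $c$ large enough, this description is shorter than $c$, contradicting
the soundness of $F$ on the statement $K(x) > c$.
\end{proof}
```

Read through this lens, a sound verifier that certified compliant instances of arbitrarily high complexity would itself yield short descriptions of high-complexity objects, which is the contradiction the abstract's threshold result exploits.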
Source: arXiv:2604.04876