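On the Limits of Prompt-Conditioned Language Models as General-Purpose Learners
1️⃣ One-Sentence Summary
Through formal analysis, this paper proves that, because language is a capacity-limited communication channel and because alignment constraints exist, GPT-style models that rely on prompting alone have, for certain task families, an error rate that stays above a strictly positive floor no matter how much data or how large a model is used. They are therefore not truly universal problem solvers, although adding multimodal information or external memory may overcome this limitation.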
Large Language Models (LLMs) are frequently portrayed as general-purpose solvers capable of handling arbitrary tasks. We argue that this view overlooks a fundamental constraint: language is a compressed and capacity-limited interface for conveying task information. Modelling User--System interaction as a bilevel \emph{cheap-talk} game, we analyse how latent tasks are encoded into prompts and reinterpreted under alignment and safety constraints. We introduce a conceptual decomposition separating task inference from execution and derive PAC-Bayes bounds that distinguish finite-sample estimation error from irreducible structural limitations. Our first main result establishes an \emph{expressivity floor}: language acts as a capacity-limited communication channel, and whenever the informational complexity of a task family exceeds the capacity of that channel, distinct tasks become unavoidably indistinguishable to the Solver, inducing a strictly positive error floor that cannot be eliminated by additional data, optimisation, or model scaling alone. We then establish an \emph{objective-misalignment floor}: when alignment constraints restrict the admissible output set, the User-ideal distribution may lie outside the feasible class, inducing an irreducible distortion. Together, these results yield a formal negative conclusion: prompt-conditioned LLMs are not universal problem solvers through prompting alone, as there exist task families for which correct behaviour is provably unattainable even in the infinite-data regime. More broadly, our analysis shows that the limits of prompt-based generalisation arise from information-constrained communication and alignment-constrained objectives. This suggests that interfaces beyond natural language, including multimodal observations and external memory, may reduce the inherent limitations of LLMs by increasing the task-relevant information available to the System.
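The intuition behind the expressivity floor can be illustrated with a toy counting argument. The sketch below is not the paper's PAC-Bayes construction; it assumes a much simpler setting chosen here for illustration: tasks drawn uniformly from a finite family, a noiseless channel that admits only a fixed number of distinct prompts, deterministic encoders, and 0-1 loss. The function names `pigeonhole_floor` and `best_error` are hypothetical. Under these assumptions, a brute-force search over every encoder reaches exactly the pigeonhole bound, and no amount of data or Solver capacity appears anywhere in the calculation.

```python
from fractions import Fraction
from itertools import product


def pigeonhole_floor(num_tasks: int, num_prompts: int) -> Fraction:
    """Closed-form lower bound on the error under the toy assumptions.

    With only num_prompts distinct prompts, at most num_prompts tasks can
    be told apart, so at least 1 - num_prompts / num_tasks of the
    (uniformly weighted) tasks must be answered incorrectly.
    """
    return max(Fraction(0), 1 - Fraction(num_prompts, num_tasks))


def best_error(num_tasks: int, num_prompts: int) -> Fraction:
    """Smallest error over all deterministic encoders, by exhaustive search.

    An encoder assigns one prompt to each task. For a fixed encoder, the
    best possible Solver guesses one task per prompt actually used, so it
    is correct on exactly as many tasks as there are distinct prompts.
    """
    best = Fraction(1)
    for encoder in product(range(num_prompts), repeat=num_tasks):
        distinguishable = len(set(encoder))
        best = min(best, 1 - Fraction(distinguishable, num_tasks))
    return best
```

For example, `best_error(4, 2)` and `pigeonhole_floor(4, 2)` both return `Fraction(1, 2)`, while `best_error(3, 5)` returns `Fraction(0, 1)` because the channel then has room for every task. Exact fractions are used instead of floats so the two functions can be compared for equality.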
Source: arXiv: 2606.23668