arXiv submission date: 2026-01-02
📄 Abstract - The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving

State-of-the-art large language model (LLM) pipelines rely on bootstrapped reasoning loops: sampling diverse chains of thought and reinforcing the highest-scoring ones, optimizing mainly for correctness. We analyze how this design choice makes training prone to collapse of the model's distribution over reasoning paths, slashing semantic entropy and undermining creative problem-solving. To analyze this failure, we introduce Distributional Creative Reasoning (DCR), a unified variational objective that casts training as gradient flow through probability measures on solution traces. STaR, GRPO, and DPO, as well as entropy bonuses and related methods, all arise as special cases of the same loss. The framework delivers three core results: (i) a diversity decay theorem, describing how correctness-based objectives lead to distinct modes of diversity decay for STaR, GRPO, and DPO; (ii) designs that ensure convergence to a stable and diverse policy, effectively preventing collapse; and (iii) simple, actionable recipes to achieve this in practice. DCR thus offers the first principled recipe for LLMs that remain both correct and creative.
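The collapse mechanism the abstract describes can be illustrated with a toy sketch (my own illustration, not the paper's DCR objective): a categorical policy over a handful of candidate reasoning paths, trained by gradient ascent on expected reward. Without regularization the distribution concentrates on the top-scoring path and its entropy decays; an entropy bonus of the kind the abstract mentions keeps it diverse. The reward values, step counts, and coefficient are arbitrary choices for the demo.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def train(entropy_coef, steps=500, lr=0.5, seed=0):
    """Gradient ascent on E_p[reward] + entropy_coef * H(p),
    where p = softmax(logits) over K candidate reasoning paths.
    Returns the final entropy of the policy."""
    rng = np.random.default_rng(seed)
    K = 8
    logits = np.zeros(K)                      # start from a uniform policy
    reward = rng.uniform(0.5, 1.0, size=K)    # several near-optimal paths
    for _ in range(steps):
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # d E_p[r] / d logit_i = p_i * (r_i - E_p[r])
        adv = reward - (p * reward).sum()
        # d H(p) / d logit_i = -p_i * (log p_i - sum_k p_k log p_k)
        ent_grad = -p * (np.log(p) - (p * np.log(p)).sum())
        logits += lr * (p * adv + entropy_coef * ent_grad)
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return entropy(p)

print(f"no entropy bonus:   H = {train(0.0):.3f}")
print(f"with entropy bonus: H = {train(0.3):.3f}")
```

Running this shows the unregularized policy ending with markedly lower entropy than the entropy-regularized one, the qualitative "diversity decay" the paper formalizes.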

Top-level tags: llm theory model training
Detailed tags: reasoning creativity distribution collapse variational objective diversity preservation

The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving


1️⃣ One-sentence summary

This paper argues that current large language models' relentless pursuit of reasoning correctness stifles creativity, and proposes a unified theoretical framework called DCR that offers the first principled approach to keeping AI models accurate while still thinking creatively.

Source: arXiv:2601.00747