Multi-Hop Knowledge Composition is Bound by Pretraining Exposure
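1️⃣ One-Sentence Summary
Large language models fail at implicit multi-hop reasoning (for example, combining a "birth date" fact and a "closest friend" fact to answer "when was this person's friend born?") even when they answer each single-hop question correctly. The study attributes the failure to pretraining exposure: unless the model directly saw contexts that compose the facts during pretraining, it cannot compose a new answer on its own, even when its single-hop knowledge is complete.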
Large Language Models fail at implicit multi-hop reasoning: a model answers "When was $X$ born?" and "Who is $Y$'s closest friend?" correctly but fails on "When was $Y$'s closest friend born?" in a single forward pass, even when both facts are perfectly memorized and individually retrievable. We study this failure in a controlled natural language setting with a strict separation between individuals exposed to compositional contexts during pretraining and those that never appear in any such context. We confirm that compositional failure persists even at 97% 1-hop accuracy, establishing the gap as a pretraining failure rather than a knowledge absence. We propose and test nine data-centric augmentation formats and find that compositional pretraining transfers to unseen questions for exposed individuals, but never to individuals absent from compositional pretraining, suggesting that exposure to compositional contexts during pretraining is a necessary condition for implicit multi-hop reasoning.
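The split described in the abstract can be illustrated with a toy sketch. Everything here is hypothetical: the `Person` record, the question templates, and the three example individuals are assumptions made for illustration, not the paper's dataset or its augmentation formats. The sketch only shows how a 2-hop question is built from two 1-hop facts, and how a held-out individual keeps its 1-hop facts in the corpus while never appearing in a compositional context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record: one entity with two 1-hop facts.
    name: str
    birth_year: int
    closest_friend: str  # name of another Person


def one_hop_questions(p):
    """The two single-hop questions about p, with gold answers."""
    return [
        (f"When was {p.name} born?", str(p.birth_year)),
        (f"Who is {p.name}'s closest friend?", p.closest_friend),
    ]


def two_hop_question(p, people):
    """Compose both facts: friend lookup first, then the friend's birth year."""
    friend = people[p.closest_friend]
    return (f"When was {p.name}'s closest friend born?", str(friend.birth_year))


def build_corpus(people, exposed):
    """1-hop text for everyone; compositional text only for exposed names."""
    one_hop = [
        f"{q} {a}" for p in people.values() for q, a in one_hop_questions(p)
    ]
    two_hop = [
        "{} {}".format(*two_hop_question(people[name], people))
        for name in sorted(exposed)
    ]
    return one_hop + two_hop


# Toy data (assumed): Cy is held out of every compositional context.
people = {
    "Ana": Person("Ana", 1980, "Ben"),
    "Ben": Person("Ben", 1975, "Ana"),
    "Cy": Person("Cy", 1990, "Ana"),
}
exposed = {"Ana", "Ben"}
corpus = build_corpus(people, exposed)
held_out_eval = two_hop_question(people["Cy"], people)
```

In this toy setup, `held_out_eval` is the kind of question the abstract reports as unanswerable in a single forward pass: both facts needed for it are in `corpus` as 1-hop text, but no compositional sentence about Cy is.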
Source: arXiv: 2606.09338