The Compressive Knowledge Graph Hypothesis: Which Graph Facts Matter for Scientific Hypothesis Generation?
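1️⃣ One-Sentence Summary
By studying how different language models perform on a hypothesis generation task for battery materials, this paper finds that the useful information in a knowledge graph can often be obtained from compact, scientifically structured subgraphs, without relying on the full local graph, and on that basis proposes the "Compressive Knowledge Graph Hypothesis".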
Knowledge graphs (KGs) can provide structured scientific context to language models, but it remains unclear which graph facts actually shape the generated hypotheses. We study KG-guided hypothesis generation for battery materials across Mistral-7B, Llama-3.1-70B, and Gemini 2.5 Flash. We perturb local KGs by varying density, ontology richness, topology, and control structure, and evaluate outputs with both provided-graph and fixed-reference metrics. Across models, KG utility is selective and model-dependent: graph context changes outputs, but no-KG outputs also recover substantial graph content from model priors. Compact top-k subgraphs often approximate full-KG behavior, including when claimed-outcome triples are held out. At the same time, compression is not unique to one semantic ranking rule: random and topology-based subsets can also recover much of the signal. These results support a redundancy-aware Compressive KG hypothesis: useful KG signal is often recoverable from compact, scientifically structured subgraphs rather than requiring the full local graph.
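To make the three kinds of subgraph selection named in the abstract concrete (a ranked top-k subset, a random subset, and a topology-based subset), here is a minimal Python sketch. It is not the paper's implementation: the function names, the scoring callback, the use of node degree as the topology signal, the entity-coverage measure, and the toy triples are all assumptions introduced only for illustration.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

# A KG fact as a (head, relation, tail) triple. Hypothetical representation.
Triple = Tuple[str, str, str]


def top_k_by_score(triples: List[Triple],
                   score: Callable[[Triple], float],
                   k: int) -> List[Triple]:
    """Keep the k triples ranked highest by a caller-supplied score.

    The score function stands in for a semantic ranking rule; the paper's
    actual rule is not specified here.
    """
    return sorted(triples, key=score, reverse=True)[:k]


def random_k(triples: List[Triple], k: int, seed: int = 0) -> List[Triple]:
    """Keep k triples chosen uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(triples, min(k, len(triples)))


def top_k_by_degree(triples: List[Triple], k: int) -> List[Triple]:
    """Keep the k triples whose endpoints have the highest combined degree.

    Degree is an assumed, simple example of a topology-based criterion.
    """
    degree: Counter = Counter()
    for head, _, tail in triples:
        degree[head] += 1
        degree[tail] += 1
    return sorted(triples,
                  key=lambda t: degree[t[0]] + degree[t[2]],
                  reverse=True)[:k]


def entity_coverage(subset: List[Triple], full: List[Triple]) -> float:
    """Fraction of the full graph's entities that still appear in the subset."""
    full_entities = {e for h, _, t in full for e in (h, t)}
    if not full_entities:
        return 0.0
    sub_entities = {e for h, _, t in subset for e in (h, t)}
    return len(sub_entities & full_entities) / len(full_entities)


# Invented toy triples, for illustration only (not data from the paper).
TOY: List[Triple] = [
    ("LiFePO4", "has_structure", "olivine"),
    ("LiFePO4", "used_as", "cathode"),
    ("graphite", "used_as", "anode"),
    ("LiCoO2", "used_as", "cathode"),
]

if __name__ == "__main__":
    for name, subset in [
        ("degree", top_k_by_degree(TOY, 2)),
        ("random", random_k(TOY, 2)),
    ]:
        print(name, subset, entity_coverage(subset, TOY))
```

Comparing a coverage-style measure across the selection rules at a fixed k is one simple way to see how much of the full local graph a compact subset retains. The evaluation in the paper uses its own provided-graph and fixed-reference metrics, which this sketch does not reproduce.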
Source: arXiv: 2605.27176