arXiv submission date: 2026-03-01
📄 Abstract - Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact

LLMs increasingly excel on AI benchmarks, but doing so does not guarantee validity for downstream tasks. This study evaluates the performance of leading foundation models (FMs, i.e., generative pre-trained base LLMs) on out-of-distribution (OOD) tasks in the teaching and learning of schoolchildren. Across all FMs, inter-model behaviors on disparate tasks correlate more strongly with each other than with expert human behaviors on the target tasks. These biases shared across LLMs are poorly aligned with downstream measures of teaching quality and are often *negatively aligned with learning outcomes*. Further, we find that multi-model ensembles, both unanimous model voting and expert-weighting by benchmark performance, further exacerbate misalignment with learning. We measure that 50% of the variation in misalignment error is shared across foundation models, suggesting that common pretraining accounts for much of the misalignment on these tasks. We demonstrate methods for robustly measuring alignment on complex tasks and provide unique insights both into educational applications of foundation models and into the limitations of the models themselves.
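The abstract's headline comparison, that models correlate more with each other than with human experts, can be sketched with synthetic data. Everything below is illustrative: the score vectors, the shared-bias construction, and the item count are assumptions, not the paper's actual measures or protocol.

```python
# Hedged sketch: compare mean inter-model correlation against mean
# model-human correlation over item-level behavior scores.
# All data is synthetic; this only illustrates the comparison logic.
import numpy as np

def pearson(a, b):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)

# Toy scores on 50 shared task items: three models sharing a common
# (pretraining-induced) component, and one human expert panel that
# tracks a different signal.
shared_bias = rng.normal(size=50)
model_scores = [shared_bias + 0.3 * rng.normal(size=50) for _ in range(3)]
human_scores = rng.normal(size=50)

# Mean correlation over all model pairs.
inter_model = np.mean([pearson(model_scores[i], model_scores[j])
                       for i in range(3) for j in range(i + 1, 3)])

# Mean correlation of each model with the human experts.
model_human = np.mean([pearson(m, human_scores) for m in model_scores])

print(f"mean inter-model r:  {inter_model:.2f}")
print(f"mean model-human r: {model_human:.2f}")
```

With a strong shared component, the inter-model correlation dominates the model-human one, mirroring the pattern the paper reports across FMs.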

Top tags: llm model evaluation benchmark
Detailed tags: alignment out-of-distribution educational ai model bias downstream performance

Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact


1️⃣ One-sentence summary

This study finds that although large language models perform well on standard benchmarks, their behavior on real-world tasks such as teaching schoolchildren deviates systematically from that of human experts, can even negatively affect learning outcomes, and that this deviation stems largely from shared flaws introduced during pretraining.

Source: arXiv 2603.00883