arXiv submission date: 2026-03-16
📄 Abstract - Beyond the Covariance Trap: Unlocking Generalization in Same-Subject Knowledge Editing for Large Language Models

While locate-then-edit knowledge editing efficiently updates knowledge encoded within Large Language Models (LLMs), a critical generalization failure mode emerges in the practical same-subject knowledge editing scenario: models fail to recall the updated knowledge when following user instructions, despite successfully recalling it in the original edited form. This paper identifies the geometric root of this generalization collapse as a fundamental conflict where the inner activation drifts induced by prompt variations exceed the model's geometric tolerance for generalization after editing. We attribute this instability to a dual pathology: (1) The joint optimization with orthogonal gradients collapses solutions into sharp minima with narrow stability, and (2) the standard covariance constraint paradoxically acts as a Covariance Trap that amplifies input perturbations. To resolve this, we introduce RoSE (Robust Same-subject Editing), which employs Isotropic Geometric Alignment to minimize representational deviation and Hierarchical Knowledge Integration to smooth the optimization landscape. Extensive experiments demonstrate that RoSE significantly improves instruction-following capabilities, laying the foundation for robust interactive parametric memory of LLM agents.

Top-level tags: llm model training theory
Detailed tags: knowledge editing generalization model robustness activation geometry parametric memory

Beyond the Covariance Trap: Unlocking Generalization in Same-Subject Knowledge Editing for Large Language Models


1️⃣ One-Sentence Summary

This paper identifies and addresses a key failure in knowledge editing for large language models: after editing, a model can recall the new knowledge in its original edited form, yet forgets it when prompted through user instructions. The authors propose a new method, RoSE, that substantially improves the stability of knowledge recall when the model follows instructions.

Source: arXiv 2603.15518