SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing
1️⃣ One-Sentence Summary
This paper proposes SCAN, a sparse editing framework that locates and modifies the small set of key neurons tied to a specific piece of knowledge in a large language model (its "knowledge circuit"). This addresses the knowledge forgetting and model collapse caused by conventional dense editing methods, so the model retains its overall performance even after thousands of sequential knowledge updates.
Large Language Models (LLMs) often suffer from catastrophic forgetting and collapse during sequential knowledge editing. This vulnerability stems from the prevailing dense editing paradigm, which treats models as black boxes and relies on coarse-grained parameter interventions that inevitably disrupt preserved knowledge. To address this, we propose SCAN, a sparse editing framework built on Sparse Circuit-Anchored Neurons, which turns editing into mechanism-aware manipulation by constructing a knowledge circuit via Sparse Transcoders. Experiments on Gemma2, Qwen3, and Llama3.1 across CounterFact, ZsRE, and WikiFactDiff demonstrate that SCAN achieves superior performance, maintaining model integrity on benchmarks like MMLU and GSM8K even after 3,000 sequential edits, whereas existing methods deteriorate progressively as edits accumulate, eventually collapsing.
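To make the mechanism concrete, below is a minimal PyTorch sketch of the idea under stated assumptions: a toy `Transcoder` that re-expresses activations in a sparse feature basis, a `locate_circuit` step that anchors an edit to the most active features, and an `apply_sparse_edit` step that confines the weight update to those features' decoder columns. The class and function names, shapes, and the optimization scheme are illustrative assumptions, not the paper's released implementation.

```python
import torch


class Transcoder(torch.nn.Module):
    """Toy sparse transcoder: re-expresses an MLP layer's activations in a
    wide, sparsely activating (and hence more interpretable) feature basis."""

    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = torch.nn.Linear(d_model, d_features)
        self.decoder = torch.nn.Linear(d_features, d_model)

    def features(self, h: torch.Tensor) -> torch.Tensor:
        # ReLU keeps only a small set of features active per token.
        return torch.relu(self.encoder(h))


def locate_circuit(tc: Transcoder, h_edit: torch.Tensor, k: int = 8) -> torch.Tensor:
    """Anchor the edit: pick the k features most active on the edit prompt's
    hidden states -- a stand-in for the paper's 'knowledge circuit'."""
    acts = tc.features(h_edit).mean(dim=0)  # average activation per feature
    return torch.topk(acts, k).indices


def apply_sparse_edit(tc: Transcoder, circuit: torch.Tensor,
                      h_edit: torch.Tensor, target: torch.Tensor,
                      lr: float = 1e-2, steps: int = 50) -> None:
    """Optimize ONLY the decoder columns of the anchored features, so the
    update cannot spill into parameters serving unrelated knowledge."""
    W = tc.decoder.weight  # shape: (d_model, d_features)
    opt = torch.optim.SGD([W], lr=lr)
    col_mask = torch.zeros_like(W)
    col_mask[:, circuit] = 1.0  # 1s only on the anchored neurons' columns
    for _ in range(steps):
        opt.zero_grad()
        feats = tc.features(h_edit).detach()  # encoder stays frozen
        out = feats @ W.T + tc.decoder.bias
        loss = torch.nn.functional.mse_loss(out, target)
        loss.backward()
        W.grad.mul_(col_mask)  # zero gradients outside the circuit
        opt.step()


# Usage with random stand-in activations (real use would take them from
# the edited LLM's MLP layer):
tc = Transcoder(d_model=64, d_features=512)
h = torch.randn(5, 64)    # hidden states of the edit prompt's tokens
tgt = torch.randn(5, 64)  # activations encoding the new fact
circuit = locate_circuit(tc, h, k=8)
apply_sparse_edit(tc, circuit, h, tgt)
```

The gradient mask is what makes the edit sparse: a dense editor would update the full weight matrix, which is exactly the coarse-grained intervention the abstract blames for forgetting and collapse.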
Source: arXiv: 2603.15226