arXiv submission date: 2026-05-12
📄 Abstract - All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs

In this paper, we present empirical and theoretical evidence against a central but largely implicit assumption in circuit and sheaf discovery (CSD), which we term the Functional Anisotropy Hypothesis: the idea that functions in large language models (LLMs) are localised to a unique or near-unique internal mechanism. We show that a single LLM task can instead be supported by multiple, structurally distinct circuits or sheaves that are simultaneously faithful, sparse, and complete. To systematically uncover such competing mechanisms, we introduce Overlap-Aware Sheaf Repulsion, a method that augments the CSD objective with an explicit penalty on structural overlap across multiple discovery runs, enabling the discovery of circuits or sheaves with strong task performance but minimal shared structure across a plethora of common CSD benchmarks. We find that this phenomenon becomes increasingly pronounced as the number of discovered sheaves grows and persists robustly across major CSD methods. We further identify an ultra-sparse three-edge sheaf and show that none of its edges is individually indispensable, undermining even weakened notions of canonical or essential components. To explain these findings, we propose a Distributive Dense Circuit Hypothesis and provide a theoretical analysis demonstrating that non-unique, low-overlap circuit explanations arise naturally from high-dimensional superposition under mild assumptions. Together, our results suggest that mechanistic explanations in LLMs are inherently non-canonical and call for a rethinking of how CSD results should be interpreted and evaluated.
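The abstract describes Overlap-Aware Sheaf Repulsion only at a high level: the CSD objective is augmented with a penalty on structural overlap with circuits or sheaves found in earlier discovery runs. The sketch below shows one way such a penalty could look, assuming each circuit is represented as a soft edge mask with values in [0, 1]. The function names, the normalised dot-product overlap, and the weights are illustrative assumptions, not the paper's actual formulation.

```python
from typing import Sequence


def overlap_penalty(mask: Sequence[float],
                    previous_masks: Sequence[Sequence[float]]) -> float:
    """Mean soft overlap between a candidate edge mask and earlier masks.

    Hypothetical helper: each mask holds one value in [0, 1] per edge
    (1 = edge kept). The overlap with one earlier mask is the dot product
    divided by the candidate's total weight, so it is 0 for disjoint masks
    and 1 when the candidate lies entirely inside the earlier mask.
    """
    size = sum(mask)
    if not previous_masks or size == 0:
        return 0.0
    overlaps = [
        sum(m * p for m, p in zip(mask, prev)) / size
        for prev in previous_masks
    ]
    return sum(overlaps) / len(overlaps)


def repulsion_objective(task_loss: float,
                        mask: Sequence[float],
                        previous_masks: Sequence[Sequence[float]],
                        sparsity_weight: float = 0.01,
                        overlap_weight: float = 1.0) -> float:
    """Assumed objective: faithfulness loss + sparsity term + overlap term."""
    sparsity = sum(mask)  # number (or soft count) of edges kept
    return (task_loss
            + sparsity_weight * sparsity
            + overlap_weight * overlap_penalty(mask, previous_masks))


# Toy usage with four edges and two previously discovered circuits.
mask_a = [1, 1, 0, 0]
mask_b = [0, 0, 1, 1]
candidate = [1, 0, 1, 0]
score = repulsion_objective(0.2, candidate, [mask_a, mask_b])
```

In this toy setup a candidate that reuses edges from earlier runs gets a higher score, so minimising the objective pushes each new run toward structurally different edges while the task-loss term still rewards faithfulness.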

Top-level tags: llm, machine learning, model evaluation
Detailed tags: circuit discovery, sheaf discovery, functional anisotropy, mechanistic interpretability, high-dimensional superposition

All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs


1️⃣ One-sentence summary

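Through experiments and theoretical analysis, the paper argues that a single task in a large language model can be implemented by multiple distinct internal mechanisms (circuits or sheaves) rather than by one unique, fixed pathway, and it proposes Overlap-Aware Sheaf Repulsion to uncover mechanisms that serve the same function but differ substantially in structure.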

Source: arXiv: 2605.12671