Abstract - ConceptTracer: Interactive Analysis of Concept Saliency and Selectivity in Neural Representations
Neural networks deliver impressive predictive performance across a variety of tasks, but they are often opaque in their decision-making processes. Despite a growing interest in mechanistic interpretability, tools for systematically exploring the representations learned by neural networks in general, and tabular foundation models in particular, remain limited. In this work, we introduce ConceptTracer, an interactive application for analyzing neural representations through the lens of human-interpretable concepts. ConceptTracer integrates two information-theoretic measures that quantify concept saliency and selectivity, enabling researchers and practitioners to identify neurons that respond strongly to individual concepts. We demonstrate the utility of ConceptTracer on representations learned by TabPFN and show that our approach facilitates the discovery of interpretable neurons. Together, these capabilities provide a practical framework for investigating how neural networks like TabPFN encode concept-level information. ConceptTracer is available at this https URL.
ConceptTracer: Interactive Analysis of Concept Saliency and Selectivity in Neural Representations
1️⃣ One-sentence summary
This paper introduces ConceptTracer, an interactive tool that quantifies concept saliency and selectivity to help researchers visually analyze and understand which neurons in a neural network's internal representations (particularly those of the tabular foundation model TabPFN) respond strongly to specific human-interpretable concepts, thereby improving model interpretability.
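The abstract does not spell out the two information-theoretic measures, so the following is a purely illustrative sketch, not the paper's actual method: it treats per-neuron saliency as the mutual information between a neuron's (median-binarized) activation and a one-vs-rest concept indicator, and selectivity as how much a neuron's top concept saliency exceeds its saliency for the remaining concepts. All function names, the binarization scheme, and the selectivity formula here are assumptions for illustration.

```python
import numpy as np

def mutual_information(x, y):
    """Discrete mutual information in bits between two integer-coded arrays."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal over x
    py = joint.sum(axis=0, keepdims=True)   # marginal over y
    nz = joint > 0                          # avoid log(0)
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

def concept_saliency(activations, concepts):
    """
    activations: (n_samples, n_neurons) float array of neuron activations.
    concepts:    (n_samples,) integer concept labels.
    Returns an (n_neurons, n_concepts) matrix: MI between each neuron's
    median-binarized activation and each one-vs-rest concept indicator.
    (Median binarization is an assumption made for this sketch.)
    """
    labels = np.unique(concepts)
    sal = np.zeros((activations.shape[1], len(labels)))
    for j in range(activations.shape[1]):
        x = (activations[:, j] > np.median(activations[:, j])).astype(int)
        for k, c in enumerate(labels):
            sal[j, k] = mutual_information(x, (concepts == c).astype(int))
    return sal

def concept_selectivity(sal):
    """Per-neuron selectivity: top concept saliency minus mean of the rest."""
    srt = np.sort(sal, axis=1)
    return srt[:, -1] - srt[:, :-1].mean(axis=1)
```

With synthetic data in which one neuron's activation is shifted whenever a chosen concept is present, that neuron receives both the highest saliency for that concept and the highest selectivity score, which is the kind of neuron-level signal the tool surfaces interactively.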