Gauge-invariant representation holonomy
1️⃣ One-sentence summary
This paper introduces a new method called "representation holonomy," which measures how a neural network's internal features change along small paths in input space, revealing differences in the geometric structure of models that traditional similarity metrics fail to capture, and finds that these differences are closely tied to model robustness.
Deep networks learn internal representations whose geometry--how features bend, rotate, and evolve--affects both generalization and robustness. Existing similarity measures such as CKA or SVCCA capture pointwise overlap between activation sets, but miss how representations change along input paths. Two models may appear nearly identical under these metrics yet respond very differently to perturbations or adversarial stress. We introduce representation holonomy, a gauge-invariant statistic that measures this path dependence. Conceptually, holonomy quantifies the "twist" accumulated when features are parallel-transported around a small loop in input space: flat representations yield zero holonomy, while nonzero values reveal hidden curvature. Our estimator fixes gauge through global whitening, aligns neighborhoods using shared subspaces and rotation-only Procrustes, and embeds the result back to the full feature space. We prove invariance to orthogonal (and affine, post-whitening) transformations, establish a linear null for affine layers, and show that holonomy vanishes at small radii. Empirically, holonomy increases with loop radius, separates models that appear similar under CKA, and correlates with adversarial and corruption robustness. It also tracks training dynamics as features form and stabilize. Together, these results position representation holonomy as a practical and scalable diagnostic for probing the geometric structure of learned representations beyond pointwise similarity.
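To make the estimator pipeline described above more concrete, here is a minimal numerical sketch of one way to realize it: whiten activations globally to fix the gauge, align neighboring loop points through a shared top-k subspace with rotation-only Procrustes, re-embed each rotation into the full feature space, and compose the rotations around the loop. All function names, the subspace dimension k, and the Frobenius-norm readout are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def whiten(X, eps=1e-8):
    """Global whitening: center X and rescale so its covariance is the identity.
    This fixes the gauge up to an orthogonal transformation."""
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return np.sqrt(len(X)) * (Xc @ Vt.T) / (S + eps)

def rotation_procrustes(A, B):
    """Best rotation R (det = +1) minimizing ||A @ R - B||_F."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    D = np.eye(U.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))  # forbid reflections
    return U @ D @ Vt

def loop_holonomy(acts, k=8):
    """acts: list of (n, d) activation matrices, one per point on a closed loop
    in input space (same n samples at every point). Returns a scalar holonomy:
    the deviation of the composed transport from the identity."""
    W = [whiten(A) for A in acts]
    d = W[0].shape[1]
    H = np.eye(d)                               # accumulated parallel transport
    for i in range(len(W)):
        A, B = W[i], W[(i + 1) % len(W)]        # wrap around to close the loop
        # shared top-k subspace of the two neighboring activation clouds
        Q, _, _ = np.linalg.svd(np.hstack([A.T, B.T]), full_matrices=False)
        P = Q[:, :k]                            # (d, k) projection
        R_sub = rotation_procrustes(A @ P, B @ P)
        # embed the k x k rotation back into the full d-dimensional feature space
        R_full = np.eye(d) + P @ (R_sub - np.eye(k)) @ P.T
        H = R_full @ H
    # a flat representation composes to the identity; nonzero values reveal curvature
    return np.linalg.norm(H - np.eye(d), ord="fro")
```

In practice one would sample the loop points as small perturbations of a base input, collect activations at the layer of interest, and compare the resulting holonomy values across models or loop radii.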
Source: arXiv: 2601.21653