Diffusion-Guided Feature Selection via Nishimori Temperature: Noise-Based Spectral Embedding
1️⃣ One-Sentence Summary
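This paper proposes Noise-Based Spectral Embedding (NBSE), a method that uses the Nishimori temperature from statistical physics to analyze a diffusion process on the data and automatically identify and remove redundant features. It sharply reduces the number of features with little loss in classification accuracy: on ImageNet embeddings from EfficientNet-B4, retaining only 30% of the features costs less than 1% accuracy.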
We propose Noise-Based Spectral Embedding (NBSE), a physics-informed framework for selecting informative features from high-dimensional data without greedy search. NBSE constructs a sparse similarity graph on the samples and identifies the Nishimori temperature $\beta_N$, the critical inverse temperature at which the Bethe Hessian becomes singular. The corresponding smallest eigenvector captures the dominant mode of an intrinsically degree-corrected diffusion process, naturally reweighting nodes to prevent hub dominance. By transposing the data matrix and applying NBSE in feature space, we obtain a one-dimensional spectral embedding that reveals groups of redundant or semantically related dimensions; balanced binning then selects one representative per group. We prove that coloured Gaussian perturbations shift $\beta_N$ by at most $O(\bar\sigma^2)$, guaranteeing robustness to measurement noise. Experiments on ImageNet embeddings from MobileNetV2 and EfficientNet-B4 show that NBSE preserves classification accuracy even under aggressive compression: on EfficientNet-B4 the accuracy drop is below $1\%$ when retaining only $30\%$ of features, outperforming ANOVA $F$-test and random selection by up to $6.8\%$.
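The final selection step, balanced binning over the one-dimensional spectral embedding, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `balanced_bin_select`, its parameters, and the rule of taking the middle element of each bin as its representative are assumptions, since the abstract does not say how the representative is chosen. The embedding itself (the smallest eigenvector of the Bethe Hessian at $\beta_N$, computed in feature space) is treated as a given input.

```python
import numpy as np


def balanced_bin_select(embedding, n_keep):
    """Pick one representative feature per equal-size bin of a 1-D embedding.

    embedding : one coordinate per feature (assumed to come from the
                feature-space spectral embedding; not computed here).
    n_keep    : number of features to retain, i.e. the number of bins.
    """
    embedding = np.asarray(embedding, dtype=float)
    if embedding.ndim != 1:
        raise ValueError("embedding must be one-dimensional")
    if not 1 <= n_keep <= embedding.size:
        raise ValueError("n_keep must be between 1 and the number of features")

    # Order features by embedding coordinate, so that features the
    # embedding places close together end up adjacent.
    order = np.argsort(embedding, kind="stable")

    # Balanced binning: contiguous bins whose sizes differ by at most one.
    bins = np.array_split(order, n_keep)

    # Hypothetical representative rule: the middle element of each bin.
    selected = [int(b[len(b) // 2]) for b in bins]
    return np.array(sorted(selected))


# Toy embedding with three visible groups of two features each.
toy_embedding = [0.9, 0.1, 0.5, 0.11, 0.52, 0.88]
print(balanced_bin_select(toy_embedding, n_keep=3))  # → [0 3 4]
```

With a data matrix `X` of shape `(n_samples, n_features)`, the reduced matrix would then be `X[:, balanced_bin_select(embedding, n_keep)]`. Equal-size bins keep the selection spread across the whole embedding range, rather than concentrating it in dense regions.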
Source: arXiv:2604.24692