arXiv submission date: 2026-05-14
📄 Abstract - Learning with Shallow Neural Networks on Cluster-Structured Features

The success of deep learning in high-dimensional settings is often attributed to the presence of low-dimensional structure in real-world data. While standard theoretical models typically assume that this structure lies in the target function, projecting unstructured inputs onto a low-dimensional subspace, data such as images, text or genomic sequences exhibit strong spatial correlations within the input space itself. In this paper, we propose a tractable model to study how these correlations affect the sample complexity of learning with gradient descent on shallow neural networks. Specifically, we consider targets that depend on a small number of latent Boolean variables, and input features grouped into clusters and correlated with the latent variables. Under an identifiability assumption, we show that for a layerwise gradient-descent variant, the sample complexity scales with the number of hidden variables and, when the signal-to-noise ratio is sufficiently high, is independent of the input dimension, up to logarithmic terms. We empirically test our theoretical findings on both synthetic and real data.

Top-level tag: machine learning theory
Detailed tags: shallow neural networks, sample complexity, gradient descent, feature clustering, high-dimensional learning

Learning with Shallow Neural Networks on Cluster-Structured Features


1️⃣ One-sentence summary

This paper proposes a tractable model showing that, when input features exhibit cluster structure (as in image pixels or genomic sequences), the amount of data a shallow neural network trained by gradient descent needs scales with the number of latent hidden variables and, at sufficiently high signal-to-noise ratio, is independent of the input dimension up to logarithmic factors, revealing how correlations within the input itself can substantially reduce learning complexity.
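To make the data model concrete, the sketch below generates synthetic inputs in the spirit of the abstract: a few latent Boolean variables, and input features grouped into clusters, each cluster correlated with one latent variable. This is a minimal illustrative construction, not the paper's exact generative model; the function name, the noise model, and the toy parity target are assumptions for demonstration.

```python
import numpy as np

def sample_cluster_structured_data(n, k, cluster_size, snr=2.0, seed=None):
    """Illustrative sketch (not the paper's exact model): k latent Boolean
    variables; each of the k feature clusters copies its latent variable
    scaled by `snr`, plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    z = rng.choice([-1.0, 1.0], size=(n, k))            # latent Boolean variables
    signal = np.repeat(z, cluster_size, axis=1)         # broadcast each latent over its cluster
    noise = rng.standard_normal((n, k * cluster_size))  # per-feature Gaussian noise
    x = snr * signal + noise                            # cluster-structured inputs
    y = np.prod(z[:, :2], axis=1)                       # toy target: parity of two latents
    return x, y

x, y = sample_cluster_structured_data(n=8, k=3, cluster_size=4, seed=0)
print(x.shape, y.shape)  # (8, 12) (8,)
```

In this construction the effective dimension a learner must recover is k (the number of latents), not k * cluster_size (the ambient input dimension), which mirrors the sample-complexity claim in the abstract.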

Source: arXiv 2605.14927