Abstract - Sublinearly Structured Deep Neural Networks Achieve Feature Learning Consistency for Compositional Functions
Over the past decade, deep neural networks (DNNs) have achieved remarkable success on complex machine-learning tasks, yet the theoretical foundations of their performance remain incomplete. From a statistical viewpoint, a natural question is: can DNNs attain feature-learning and prediction consistency comparable to that of classical models? While a full characterization is open, we provide positive results for a broad subclass. We establish feature-learning consistency guarantees for sublinearly structured DNNs (architectures whose input/output dimensions and number of hidden neurons grow sublinearly with the sample size) when learning hierarchically compositional target functions. Importantly, this consistency holds even in the conventional "over-parameterized" regime, where the total number of parameters exceeds the number of training samples. Empirically, sublinearly structured DNNs match or surpass wide DNNs in prediction. A structural audit further indicates that widely used convolutional neural networks (CNNs), including AlexNet, VGGNet, ResNet, and GoogLeNet, are sublinearly structured on their image classification benchmarks. We further prove that sublinearly structured DNNs achieve universal approximation for hierarchically compositional functions in the large-sample limit. Moreover, images exhibit an inherent hierarchical, compositional structure. Taken together, these results explain, through a statistical lens, why many large-scale deep learning models succeed after adequate training on massive image datasets.
Sublinearly Structured Deep Neural Networks Achieve Feature Learning Consistency for Compositional Functions
1️⃣ One-Sentence Summary
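This paper proves that, for target functions with a hierarchical compositional structure, a deep neural network whose input/output dimensions and number of hidden neurons grow sublinearly with the sample size (that is, a network that is not arbitrarily wide) achieves feature-learning and prediction consistency comparable to that of classical statistical models, and retains this consistency even in the over-parameterized regime (more parameters than samples); this helps explain why widely used CNNs such as AlexNet and VGGNet perform so well on image tasks.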
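The claim that a network can be sublinearly structured and still over-parameterized can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a plain fully connected network and a power-law width rule `n ** alpha` with `alpha < 1`, neither of which is taken from the paper, and the names `sublinear_width` and `mlp_param_count` are hypothetical helpers.

```python
def sublinear_width(n: int, alpha: float) -> int:
    """Layer width that grows like n**alpha; alpha < 1 makes it sublinear in n.

    The power-law rule is an assumption for illustration, not the paper's definition.
    """
    return max(1, round(n ** alpha))


def mlp_param_count(input_dim: int, width: int, depth: int, output_dim: int) -> int:
    """Count weights and biases of a fully connected net with `depth` equal-width hidden layers."""
    dims = [input_dim] + [width] * depth + [output_dim]
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims[:-1], dims[1:]))


n = 10_000                               # sample size
alpha = 0.75                             # assumed growth exponent, strictly below 1
width = sublinear_width(n, alpha)        # per-layer width, far below n
depth = 3
hidden_neurons = depth * width           # total hidden neurons, still below n

# Input dimension is also tied to the sublinear width in this toy setup.
params = mlp_param_count(input_dim=width, width=width, depth=depth, output_dim=10)

print(f"samples={n}, width={width}, hidden neurons={hidden_neurons}, parameters={params}")
print("over-parameterized:", params > n)
```

Because the weight count scales roughly with the square of the width, any exponent above 1/2 in this toy setup eventually yields more parameters than samples while every structural dimension stays below the sample size.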