Complexity of Linear Regions in Self-supervised Deep ReLU Networks
1️⃣ One-sentence summary
By analysing the number, area, eccentricity, and boundary evolution of linear regions in self-supervised deep learning models (e.g. contrastive learning and self-distillation methods), this paper finds that self-supervised models reach accuracy comparable to supervised models while using far fewer linear regions, and that the geometric characteristics of linear regions can serve as reliable indicators of representation quality and detectors of representation collapse.
There has been growing interest in studying the complexity of networks with Rectified Linear Unit (ReLU) activations. Recent work investigates the evolution of the number of piecewise-linear partitions (linear regions) formed during training. However, current research is limited to the complexity of models trained in a supervised way. Self-Supervised Learning (SSL) differs in that it directly optimises the representation space using a loss function designed to improve the model's performance across multiple downstream tasks. This study investigates the local distribution of linear regions produced by SSL models. Using SplineCam to extract two-dimensional polytopes near the data distribution, we demonstrate that the evolution of linear regions correlates with representation quality. We track the number, area, eccentricity, and boundaries of regions throughout training. The study compares supervised, contrastive, and self-distillation methods on two standard benchmark datasets, MNIST and FashionMNIST. The analysis of the experimental results shows that self-supervised methods create substantially fewer regions while achieving accuracy comparable to supervised models. Contrastive methods rapidly expand regions over time, whereas self-distillation methods tend to consolidate by merging neighbouring regions. Lastly, we show that representation collapse can be detected early from the geometry of linear regions. Our analysis suggests that polytopal metrics can serve as reliable indicators of representation quality and model performance.
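The per-region metrics named in the abstract (count, area, eccentricity) are straightforward to compute once the two-dimensional polytopes have been extracted. Below is a minimal NumPy sketch, assuming each region is given as an ordered array of polygon vertices (as a tool such as SplineCam would produce on a 2D input slice); the helper names and the best-fit-ellipse eccentricity definition are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def polygon_area(vertices: np.ndarray) -> float:
    """Shoelace formula for the area of a simple 2D polygon.
    `vertices` is an (N, 2) array of vertices in order."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def polygon_eccentricity(vertices: np.ndarray) -> float:
    """One common proxy for how elongated a region is: the
    eccentricity of the best-fit ellipse, estimated from the
    eigenvalues of the vertex covariance matrix (an assumption,
    not necessarily the paper's definition)."""
    cov = np.cov(vertices.T)
    lam = np.sort(np.linalg.eigvalsh(cov))  # lam[0] <= lam[1]
    return float(np.sqrt(1.0 - lam[0] / lam[1])) if lam[1] > 0 else 0.0

def region_stats(regions: list[np.ndarray]) -> dict:
    """Summarise a set of 2D linear regions extracted near the
    data distribution: count, mean area, mean eccentricity."""
    areas = np.array([polygon_area(v) for v in regions])
    eccs = np.array([polygon_eccentricity(v) for v in regions])
    return {"count": len(regions),
            "mean_area": float(areas.mean()),
            "mean_eccentricity": float(eccs.mean())}

# Example: a unit square (eccentricity ~0) vs. a thin sliver (~1).
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
sliver = np.array([[0, 0], [4, 0], [4, 0.1], [0, 0.1]], dtype=float)
print(region_stats([square, sliver]))
```

Tracking such statistics over training checkpoints is what would let region consolidation (self-distillation) or rapid expansion (contrastive methods) show up as trends in the summary values.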
Source: arXiv: 2604.24393