arXiv submission date: 2026-06-24
📄 Abstract - Pre-Warm: Input-Conditioned Weight Initialization for Convolutional Neural Networks

We introduce Pre-Warm, a simple yet effective zero-training-cost method for data-conditioned initialization of the first convolutional layer. Before the first forward pass, Pre-Warm extracts mean-centered local patches from a single training batch, clusters them with MiniBatchKMeans, applies inverse Manhattan spatial weighting, and uses the resulting centroids to initialize half of the first-layer filters (the remainder retain Kaiming initialization). We derive closed-form rules for all hyperparameters except a single scale parameter, to which results are insensitive and for which we derive a Kaiming parity bound from patch dimensionality. To set the patch count, we use Otsu's foreground density for grayscale datasets and the mean L2 norm of mean-centered patches for natural color images; both rules accurately predict the optimal patch count observed in grid search. Across five standard benchmarks (MNIST, Fashion-MNIST, CIFAR-10, SVHN, and CIFAR-100) with 8-seed paired experiments, Pre-Warm yields statistically significant accuracy improvements over standard Kaiming initialization (p < 0.05 on all datasets, p = 0.0007 on SVHN with 8/8 wins, p = 0.0033 on CIFAR-100 with 7/8 wins). The method adds negligible overhead, requires no architectural changes, and integrates into existing training pipelines with only a few lines of code. Pre-Warm demonstrates that even a lightweight, input-dependent signal can meaningfully improve optimization trajectories in modern convolutional networks.
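The pipeline described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. A plain Lloyd-style k-means stands in for MiniBatchKMeans so the example has no extra dependencies. The weighting form `1 / (1 + |dy| + |dx|)`, the per-filter rescaling to the Kaiming standard deviation, and all function names and default values (`pre_warm`, `n_patches`, `scale`, `n_iter`) are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def extract_patches(batch, k, n_patches, rng):
    """Sample k x k patches from a (N, C, H, W) batch and mean-center each one."""
    n, c, h, w = batch.shape
    idx = rng.integers(0, n, size=n_patches)
    ys = rng.integers(0, h - k + 1, size=n_patches)
    xs = rng.integers(0, w - k + 1, size=n_patches)
    patches = np.stack([batch[i, :, y:y + k, x:x + k]
                        for i, y, x in zip(idx, ys, xs)])
    patches = patches.reshape(n_patches, -1).astype(np.float64)
    return patches - patches.mean(axis=1, keepdims=True)

def kmeans(x, n_clusters, n_iter, rng):
    """Plain Lloyd k-means (a stand-in for MiniBatchKMeans)."""
    centroids = x[rng.choice(len(x), size=n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(n_clusters):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def manhattan_weights(k):
    """Assumed form of inverse Manhattan weighting around the kernel center."""
    center = (k - 1) / 2
    yy, xx = np.mgrid[0:k, 0:k]
    return 1.0 / (1.0 + np.abs(yy - center) + np.abs(xx - center))

def pre_warm(weight, batch, n_patches=2000, scale=1.0, n_iter=10, seed=0):
    """Return a copy of `weight` (out_ch, in_ch, k, k) whose first half of
    filters is replaced by weighted, rescaled patch centroids."""
    rng = np.random.default_rng(seed)
    out_ch, in_ch, k, _ = weight.shape
    n_warm = out_ch // 2
    patches = extract_patches(batch, k, n_patches, rng)
    centroids = kmeans(patches, n_warm, n_iter, rng).reshape(n_warm, in_ch, k, k)
    centroids = centroids * manhattan_weights(k)  # broadcast over (k, k)
    # Assumption: match each filter's std to the Kaiming std for this fan-in.
    target_std = scale * np.sqrt(2.0 / (in_ch * k * k))
    std = centroids.reshape(n_warm, -1).std(axis=1).reshape(n_warm, 1, 1, 1)
    centroids = centroids * target_std / (std + 1e-8)
    new_weight = weight.copy()
    new_weight[:n_warm] = centroids  # the remaining filters keep their init
    return new_weight
```

In a real training script, `weight` would be the first convolution's Kaiming-initialized tensor converted to an array, and the result would be copied back into the layer before the first forward pass.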

Top-level tags: machine learning, computer vision
Detailed tags: weight initialization, convolutional neural networks, data-conditioned, k-means clustering

Pre-Warm: Input-Conditioned Weight Initialization for Convolutional Neural Networks


1️⃣ One-sentence summary

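This paper proposes Pre-Warm, a lightweight method that extracts local image patches from a single training batch, clusters them, and uses the cluster centroids to initialize part of the first-layer filter weights of a convolutional neural network, significantly improving accuracy on several image classification tasks without adding training cost.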

Source: arXiv: 2606.25256