arXiv submission date: 2026-02-03
📄 Abstract - Quantization-Aware Regularizers for Deep Neural Networks Compression

Deep Neural Networks have reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained devices. As a result, model compression has become essential, and -- among compression techniques -- weight quantization is widely used and particularly effective, yet it typically introduces a non-negligible accuracy drop. Moreover, it is usually applied to already-trained models, without influencing how the parameter space is explored during the learning phase. In contrast, we introduce per-layer regularization terms that drive weights to naturally form clusters during training, integrating quantization awareness directly into the optimization process. This reduces the accuracy loss typically associated with quantization methods while preserving their compression potential. Furthermore, in our framework quantization representatives become network parameters, marking, to the best of our knowledge, the first approach to embed quantization parameters directly into the backpropagation procedure. Experiments on CIFAR-10 with AlexNet and VGG16 models confirm the effectiveness of the proposed strategy.
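
The abstract describes per-layer regularizers that pull weights toward quantization representatives which are themselves learnable parameters updated by backpropagation. The paper's exact formulation is not given here, so the following is only a minimal PyTorch-style sketch of the general idea; the class name `QuantAwareRegularizer`, the squared-distance-to-nearest-representative penalty, and hyperparameters such as `num_levels` and `reg_strength` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class QuantAwareRegularizer(nn.Module):
    """Per-layer penalty with learnable quantization representatives (assumed form)."""

    def __init__(self, num_levels: int = 4, init_range: float = 0.1):
        super().__init__()
        # Learnable representatives (cluster centers); trained by backpropagation.
        self.representatives = nn.Parameter(
            torch.linspace(-init_range, init_range, num_levels)
        )

    def forward(self, weight: torch.Tensor) -> torch.Tensor:
        # Squared distance of each weight to each representative: (N, num_levels).
        dists = (weight.reshape(-1, 1) - self.representatives.reshape(1, -1)) ** 2
        # Penalize the distance to the nearest representative so that
        # weights are driven to cluster around the representatives.
        return dists.min(dim=1).values.mean()


# Usage sketch: add the per-layer penalties to the task loss during training.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
linear_layers = [m for m in model if isinstance(m, nn.Linear)]
regularizers = nn.ModuleList(QuantAwareRegularizer() for _ in linear_layers)

optimizer = torch.optim.SGD(
    list(model.parameters()) + list(regularizers.parameters()), lr=0.01
)

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
reg_strength = 1e-3  # assumed hyperparameter
reg_loss = sum(r(m.weight) for r, m in zip(regularizers, linear_layers))
loss = nn.functional.cross_entropy(model(x), y) + reg_strength * reg_loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In this sketch both the layer weights and the representatives receive gradients from the penalty, which mirrors the abstract's claim that quantization representatives become network parameters trained by backpropagation.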

Top-level tags: model training, machine learning systems
Detailed tags: neural network compression, quantization, regularization, optimization, parameter clustering

Quantization-Aware Regularizers for Deep Neural Networks Compression


1️⃣ One-Sentence Summary

This paper proposes a new method that adds quantization-aware regularization terms directly into neural network training, so that the weights naturally form clusters during training. This reduces the accuracy loss of subsequent quantization-based compression and yields more efficient model compression.

Source: arXiv: 2602.03614