WaterSIC: information-theoretically (near) optimal linear layer quantization
1️⃣ One-sentence summary
This paper proposes a new algorithm, WaterSIC, which assigns different numbers of quantization bits to different columns of a neural network linear layer's weight matrix. The resulting compression is information-theoretically near-optimal and markedly improves the performance of large language models quantized to low precisions of 1 to 4 bits.
This paper considers the problem of converting a given dense linear layer to low precision. The tradeoff between compressed length and output discrepancy is analyzed information-theoretically (IT). It is shown that the popular GPTQ algorithm may have an arbitrarily large gap to the IT limit. To alleviate this problem, a novel algorithm, termed "WaterSIC", is proposed and is shown to be within a rate gap of 0.255 bits of the IT limit, uniformly over all possible covariance matrices of input activations. The key innovation of WaterSIC is to allocate different quantization rates to different columns (in-features) of the weight matrix, mimicking the classical IT solution known as "waterfilling". Applying WaterSIC to the Llama and Qwen families of LLMs establishes new state-of-the-art performance for all quantization rates from 1 to 4 bits.
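The paper does not include code; the sketch below only illustrates the classical reverse-waterfilling rate allocation that WaterSIC is said to mimic, not the WaterSIC algorithm itself. It assumes per-column importances are summarized as variances (in practice these would come from the input-activation covariance), and the function name and bisection scheme are my own illustrative choices. Each column receives a fractional rate `R_i = max(0, 0.5 * log2(var_i / D))`, with the "water level" `D` chosen so the mean rate meets the bit budget.

```python
import numpy as np

def waterfill_rates(variances, avg_bits, iters=200):
    """Classical reverse-waterfilling rate allocation (illustrative sketch).

    Column i gets R_i = max(0, 0.5 * log2(var_i / D)) fractional bits,
    where the water level D is found by bisection so that the mean rate
    matches the target budget. High-variance (more important) columns
    receive more bits; very low-variance columns may receive zero.
    """
    v = np.asarray(variances, dtype=float)
    lo, hi = 0.0, float(v.max())          # D lies in (0, max variance]
    for _ in range(iters):
        D = 0.5 * (lo + hi)               # D > 0 since hi > 0 throughout
        rates = np.maximum(0.0, 0.5 * np.log2(v / D))
        if rates.mean() > avg_bits:
            lo = D                        # too many bits: raise water level
        else:
            hi = D                        # too few bits: lower water level
    return np.maximum(0.0, 0.5 * np.log2(v / (0.5 * (lo + hi))))

# Variances [4, 1, 0.25] have geometric mean 1, so a 2-bit budget
# yields water level D = 2**-4 and rates [3, 2, 1].
print(waterfill_rates([4.0, 1.0, 0.25], avg_bits=2.0))
```

A practical quantizer would additionally round these fractional rates to supported integer bit-widths; the continuous solution above is only the information-theoretic target.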
From arXiv: 2603.04956