arXiv submission date: 2026-05-19
📄 Abstract - Block-Sphere Vector Quantization

Vector quantization is a fundamental primitive for scalable machine learning systems, enabling memory-efficient storage, fast retrieval, and compressed inference. Recent rotation-based quantizers such as EDEN, RabitQ, and TurboQuant have introduced strong guarantees and empirical performance, but the surrounding comparisons have been difficult to interpret because they rely on different distortion criteria, probability regimes, and implementation assumptions. As our first contribution, we provide a unified theoretical comparison of these methods and show that their relative advantages are criterion-dependent rather than absolute: EDEN and TurboQuant are favorable for MSE distortion, EDEN is also effective for expected inner-product distortion, and RabitQ provides strong high-probability control. This comparison further clarifies that EDEN provides particularly strong guarantees for expected distortion measures. As our second contribution, we introduce Block-Sphere Quantization (BlockQuant), a new rotation-based block quantization algorithm designed around the spherical geometry of randomly rotated vectors. Unlike coordinate-wise quantizers, BlockQuant quantizes blocks on the sphere, preserving the geometry of rotated embeddings more faithfully. We prove that this block-spherical design theoretically improves over the baselines considered in this paper for both reconstruction MSE and expected inner-product distortion. Our experiments on real embedding datasets and long-context LLM inference tasks show practical gains that are consistent with our theoretical improvements.
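To make the rotate-then-quantize idea concrete, here is a minimal sketch of a generic coordinate-wise 1-bit quantizer applied after a randomized rotation. It is not BlockQuant, EDEN, RabitQ, or TurboQuant, whose rotations and codebooks are not described on this page. The rotation (random sign flips followed by a normalized Walsh-Hadamard transform, a common stand-in for a random orthogonal matrix), the least-squares scale, and all function names are assumptions chosen for illustration.

```python
import math
import random


def fwht(v):
    """Orthonormal fast Walsh-Hadamard transform; len(v) must be a power of two."""
    out = list(v)
    n = len(out)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in out]


def rotate(v, signs):
    """Apply R = H D: flip signs (D), then the orthonormal Hadamard transform (H)."""
    return fwht([s * x for s, x in zip(signs, v)])


def unrotate(v, signs):
    """Apply R^T = D H; H is symmetric and orthonormal, D is diagonal with +/-1."""
    return [s * x for s, x in zip(signs, fwht(v))]


def quantize_1bit(v, signs):
    """Keep one sign bit per rotated coordinate plus a single float scale."""
    r = rotate(v, signs)
    bits = [1 if x >= 0 else -1 for x in r]
    # Least-squares scale for ||r - scale * bits||^2 is the mean of |r_i|.
    scale = sum(abs(x) for x in r) / len(r)
    return bits, scale


def dequantize(bits, scale, signs):
    """Rebuild an approximation of the original vector from bits and scale."""
    return unrotate([scale * b for b in bits], signs)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 64  # hypothetical dimension, must be a power of two for fwht
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    bits, scale = quantize_1bit(v, signs)
    rec = dequantize(bits, scale, signs)
    rel_err = sum((a - b) ** 2 for a, b in zip(rec, v)) / sum(x * x for x in v)
    print(f"relative squared reconstruction error: {rel_err:.3f}")
```

Because the rotation is orthogonal, the error measured in the rotated domain equals the error in the original domain. When the rotated coordinates are roughly Gaussian, the relative squared error of this 1-bit scheme tends toward 1 - 2/π ≈ 0.36, which is the kind of per-coordinate baseline that a block-wise spherical quantizer would aim to improve on.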

Top-level tags: machine learning systems
Detailed tags: vector quantization, rotation-based quantization, embedding compression, LLM inference

Block-Sphere Vector Quantization


1️⃣ One-sentence summary

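This paper proposes BlockQuant, a new vector quantization method based on spherical geometry that theoretically improves on existing rotation-based quantizers (such as EDEN and RabitQ) in both reconstruction error and inner-product distortion, with practical gains confirmed on real embedding datasets and long-context LLM inference tasks.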

Source: arXiv: 2605.19972