BitNet Text Embeddings
1️⃣ One-sentence summary
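This paper proposes BITEMBED, an extreme low-bit framework that converts LLM-based text embedding models into efficient encoders with ternary weights and quantized activations, and that also supports output embeddings at multiple precisions; it substantially reduces compute and storage costs while keeping performance comparable to the original full-precision models.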
LLM-based text embedders have substantially improved retrieval and semantic representation quality, but their deployment remains costly: large backbone models slow down embedding inference, while high-dimensional full-precision embeddings impose substantial storage and bandwidth overhead on large-scale indexes. In this paper, we present BITEMBED, an extreme low-bit framework for LLM-based text embedding that jointly targets encoding efficiency and vector storage. BITEMBED converts pretrained LLM backbones into BitNet-style embedding encoders with ternary weights, quantized activations, and lightweight normalization refinement. The converted model is adapted to representation learning through continual contrastive pre-training, followed by supervised contrastive fine-tuning with both similarity-distribution distillation and attention-relation distillation from a full-precision teacher. Beyond quantizing the backbone, BITEMBED further trains output embeddings to support multiple storage precisions, accommodating the differing storage needs of different deployment scenarios. Experiments on MMTEB (eng, v2) with Qwen3-0.6B and Gemma3-270M show that BITEMBED is largely comparable to the full-precision teacher embedders. Moreover, BITEMBED can produce text embeddings at several precisions, letting users trade retrieval performance against storage cost.
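To make the two kinds of quantization mentioned above concrete, here is a minimal sketch in plain Python. It is not BITEMBED's implementation. The abstract does not give the exact quantizers, so the code assumes the absmean ternary scheme commonly associated with BitNet b1.58 for weights, and a simple sign / symmetric absmax scheme for stored embeddings. The function names `ternarize` and `quantize_embedding` are hypothetical.

```python
from typing import List, Tuple


def ternarize(weights: List[float], eps: float = 1e-8) -> Tuple[List[int], float]:
    """Map weights to {-1, 0, +1} plus one shared scale.

    Assumption: absmean scaling (scale = mean |w|), then round and clip.
    The abstract only says "ternary weights"; the exact rule may differ.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Python's round() uses round-half-to-even; clip the result to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


def quantize_embedding(vec: List[float], bits: int) -> List[int]:
    """Encode an output embedding at a chosen storage precision.

    Assumption: bits == 1 keeps only the sign bit; otherwise symmetric
    absmax integer codes in [-(2^(bits-1) - 1), 2^(bits-1) - 1].
    """
    if bits == 1:
        return [1 if x >= 0 else 0 for x in vec]
    qmax = 2 ** (bits - 1) - 1
    absmax = max(abs(x) for x in vec) or 1.0  # avoid dividing by zero
    return [round(x / absmax * qmax) for x in vec]


if __name__ == "__main__":
    codes, scale = ternarize([0.9, -0.05, -1.2, 0.3])
    print(codes, scale)
    emb = [0.3, -1.0, 0.7, 0.0]
    print(quantize_embedding(emb, 8))
    print(quantize_embedding(emb, 1))
```

As a rough illustration of the storage trade-off, a hypothetical 1024-dimensional embedding takes 4096 bytes in float32, 1024 bytes as 8-bit codes, and 128 bytes as packed 1-bit codes. The dimension is chosen for the arithmetic only and is not taken from the paper.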
Source: arXiv: 2606.25674