arXiv submission date: 2026-01-03
📄 Abstract - KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

While LLMs are powerful embedding backbones, their application in training-free settings faces two structural challenges: causal attention restricts early tokens from accessing subsequent context, and the next-token prediction objective biases representations toward generation rather than semantic compression. To address these limitations, we propose KV-Embedding, a framework that activates the latent representation power of frozen LLMs. Our method leverages the observation that the key-value (KV) states of the final token at each layer encode a compressed view of the sequence. By re-routing these states as a prepended prefix, we enable all tokens to access sequence-level context within a single forward pass. To ensure model-agnostic applicability, we introduce an automated layer selection strategy based on intrinsic dimensionality. Evaluations on MTEB across Qwen, Mistral, and Llama backbones show that KV-Embedding outperforms existing training-free baselines by up to 10%, while maintaining robust performance on sequences up to 4,096 tokens. These results demonstrate that internal state manipulation offers an efficient alternative to input modification, and we hope this work encourages further exploration of LLM internals for representation learning.
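Below is a minimal sketch of the KV re-routing idea described in the abstract, assuming a HuggingFace `transformers` decoder-only backbone (e.g. a Qwen, Mistral, or Llama checkpoint). The function name `kv_embed`, the mean-pooling step, and the position/attention-mask handling are illustrative assumptions, not the paper's exact procedure; the automated layer-selection step based on intrinsic dimensionality is omitted here.

```python
# Sketch of KV re-routing: take the final token's per-layer key/value states from a
# first causal pass and prepend them as a one-token KV prefix for a second pass,
# so every token can attend to a sequence-level summary. Assumptions are noted inline.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache


@torch.no_grad()
def kv_embed(model, tokenizer, text, device="cpu"):
    enc = tokenizer(text, return_tensors="pt").to(device)
    seq_len = enc["input_ids"].shape[1]

    # Pass 1: ordinary causal forward pass, collecting the per-layer KV cache.
    first = model(**enc, use_cache=True)

    # Keep only the final token's key/value at every layer; these states act as a
    # compressed view of the whole sequence.
    cache_out = first.past_key_values
    if hasattr(cache_out, "to_legacy_cache"):  # newer transformers return a Cache object
        cache_out = cache_out.to_legacy_cache()
    prefix = tuple((k[:, :, -1:, :], v[:, :, -1:, :]) for k, v in cache_out)

    # Pass 2: re-feed the same tokens with the one-token summary prepended as a
    # KV prefix, so early tokens now see sequence-level context in a single pass.
    cache = DynamicCache.from_legacy_cache(prefix)
    attn = torch.ones(1, 1 + seq_len, dtype=torch.long, device=device)      # prefix + tokens
    pos = torch.arange(1, 1 + seq_len, device=device).unsqueeze(0)          # simplified positions
    second = model(
        input_ids=enc["input_ids"],
        attention_mask=attn,
        position_ids=pos,
        past_key_values=cache,
        output_hidden_states=True,
    )

    # Mean-pool the last hidden layer into a single embedding vector
    # (an assumed pooling choice, not necessarily the paper's).
    return second.hidden_states[-1].mean(dim=1).squeeze(0)


# Usage (checkpoint name is an example, not necessarily one used in the paper):
# tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
# mdl = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
# vec = kv_embed(mdl, tok, "KV re-routing turns a frozen LLM into an encoder.")
```

The key point the sketch illustrates is that only internal KV states are manipulated; the input text and the model weights stay untouched, which is what makes the approach training-free and input-modification-free.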

Top-level tags: llm, natural language processing, model evaluation
Detailed tags: text embedding, training-free, key-value states, representation learning, internal state manipulation

KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs


1️⃣ One-sentence summary

This paper proposes a method called KV-Embedding, which reorganizes the key-value states inside a large language model so that, without any additional training, the model can efficiently produce high-quality semantic text representations, addressing the two main obstacles of training-free approaches: restricted access to later context and the generation-oriented bias that works against semantic compression.

Source: arXiv:2601.01046