arXiv submission date: 2026-03-24
📄 Abstract - Improving LLM Predictions via Inter-Layer Structural Encoders

The standard practice in Large Language Models (LLMs) is to base predictions on the final-layer token representations. Recent studies, however, show that intermediate layers encode substantial information and may contain more task-relevant features than the final-layer representations alone. Importantly, it has been shown that different layers may be optimal for different tasks. In this work we introduce Inter-Layer Structural Encoders (ILSE), a structural approach that learns a single effective representation jointly from all of an LLM's internal layer representations. Central to ILSE is Cayley-Encoder, a mathematically grounded geometric encoder that leverages expander Cayley graphs for efficient inter-layer information propagation. We evaluate ILSE across 13 classification and semantic similarity tasks with 9 pre-trained LLMs ranging from 14 million to 8 billion parameters. ILSE consistently outperforms baselines and existing approaches, achieving up to 44% improvement in accuracy and 25% in similarity metrics. We further show that ILSE is data-efficient in few-shot regimes and can make small LLMs competitive with substantially larger models.
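The abstract describes the core mechanism: per-layer representations are mixed along the edges of a Cayley graph before a final readout. The sketch below illustrates that idea in miniature, assuming hypothetical shapes (12 layers, hidden size 64) and a Cayley graph of the cyclic group Z_L with a hypothetical generator set; the paper's actual expander construction and encoder architecture are not reproduced here.

```python
import numpy as np

L, d = 12, 64  # hypothetical: 12 layers, hidden size 64
rng = np.random.default_rng(0)
H = rng.normal(size=(L, d))  # stand-in for per-layer pooled token representations

# Cayley graph of the cyclic group Z_L with symmetric generator set S:
# vertex i connects to (i + s) mod L for each s in S.
S = [1, -1, 5, -5]  # hypothetical generators; the paper uses expander Cayley graphs
A = np.zeros((L, L))
for i in range(L):
    for s in S:
        A[i, (i + s) % L] = 1.0

# Row-normalize and do one round of message passing between layers,
# so each layer's representation mixes with its Cayley-graph neighbors.
P = A / A.sum(axis=1, keepdims=True)
H_mixed = P @ H           # inter-layer information propagation
z = H_mixed.mean(axis=0)  # pooled representation for a downstream head
assert z.shape == (d,)
```

Because the generator set is symmetric, the adjacency matrix is symmetric, and the expander property (in the real construction) is what lets information spread between distant layers in few propagation rounds.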

Top tags: llm, model training, model evaluation
Detailed tags: layer aggregation, geometric encoding, cayley graphs, representation learning, few-shot learning

Improving LLM Predictions via Inter-Layer Structural Encoders


1️⃣ One-Sentence Summary

This paper proposes a new method called ILSE, which uses a graph-theoretically grounded encoder to integrate useful information from the intermediate layers of a large language model, making its predictions more accurate across a variety of tasks and even allowing small models to rival the performance of much larger ones.

Source: arXiv:2603.22665