arXiv submission date: 2026-03-24
📄 Abstract - Improving LLM Predictions via Inter-Layer Structural Encoders

The standard practice in Large Language Models (LLMs) is to base predictions on the final-layer token representations. Recent studies, however, show that intermediate layers encode substantial information and may contain more task-relevant features than the final-layer representations alone. Importantly, it has been shown that different layers may be optimal for different tasks. In this work we introduce Inter-Layer Structural Encoders (ILSE), a structural approach that learns a single effective representation jointly from all of an LLM's internal layer representations. Central to ILSE is Cayley-Encoder, a mathematically grounded geometric encoder that leverages expander Cayley graphs for efficient inter-layer information propagation. We evaluate ILSE across 13 classification and semantic similarity tasks with 9 pre-trained LLMs ranging from 14 million to 8 billion parameters. ILSE consistently outperforms baselines and existing approaches, achieving up to 44% improvement in accuracy and 25% in similarity metrics. We further show that ILSE is data-efficient in few-shot regimes and can make small LLMs competitive with substantially larger models.
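The abstract describes the core mechanism: per-layer representations are mixed along the edges of a Cayley graph before a final readout. The sketch below illustrates that idea in miniature, assuming hypothetical shapes (12 layers, hidden size 64) and a Cayley graph of the cyclic group Z_L with a hypothetical generator set; the paper's actual expander construction and encoder architecture are not reproduced here.

```python
import numpy as np

L, d = 12, 64  # hypothetical: 12 layers, hidden size 64
rng = np.random.default_rng(0)
H = rng.normal(size=(L, d))  # stand-in for per-layer pooled token representations

# Cayley graph of the cyclic group Z_L with symmetric generator set S:
# vertex i connects to (i + s) mod L for each s in S.
S = [1, -1, 5, -5]  # hypothetical generators; the paper uses expander Cayley graphs
A = np.zeros((L, L))
for i in range(L):
    for s in S:
        A[i, (i + s) % L] = 1.0

# Row-normalize and do one round of message passing between layers,
# so each layer's representation mixes with its Cayley-graph neighbors.
P = A / A.sum(axis=1, keepdims=True)
H_mixed = P @ H           # inter-layer information propagation
z = H_mixed.mean(axis=0)  # pooled representation for a downstream head
assert z.shape == (d,)
```

Because the generator set is symmetric, the adjacency matrix is symmetric, and the expander property (in the real construction) is what lets information spread between distant layers in few propagation rounds.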

Top tags: llm, model training, model evaluation
Detailed tags: layer aggregation, geometric encoding, cayley graphs, representation learning, few-shot learning

Improving LLM Predictions via Inter-Layer Structural Encoders


1️⃣ One-Sentence Summary

This paper proposes a new method called ILSE, which uses a graph-theoretically grounded encoder to integrate useful information from the intermediate layers of a large language model, making its predictions more accurate across a variety of tasks and even allowing small models to rival the performance of much larger ones.

Source: arXiv:2603.22665