arXiv submission date: 2026-05-27
📄 Abstract - The Missing Piece in Pre-trained Model Evaluation: Reward-Guided Decoding Unlocks Task-Oriented Behavior Without Parameter Updates

With the rapid progress of large language models (LLMs), reliably evaluating the capabilities of pre-trained LLMs has become increasingly important. The challenge is that base pre-trained models are optimized for next-token prediction and often fail to follow instructions or produce well-formed answers under standard prompting and direct decoding. As a result, benchmark performance can conflate model capability with decoding-induced failures to produce task-oriented outputs, while exposing such behavior often relies on costly post-training. Recent decoding-only approaches attempt to reshape output distributions, but such methods can be inefficient and brittle across open-ended tasks. To address these limitations, we propose Energy-Based Decoding (EBD), a training-free, reward-guided framework for activating task-oriented behaviors from frozen pre-trained LLMs across both open-ended and objective tasks. EBD augments decoding with an external lightweight reward model, steering generations toward high-utility responses while anchoring them to the pre-trained model prior through a reward-tilted target distribution. We show that EBD shifts base-model outputs toward more instruction-following behavior, increasing behavioral similarity to post-trained counterparts and enabling a fairer inference-time evaluation of accessible pre-trained-model behavior. Empirically, EBD outperforms baselines across five models and six benchmarks, improving Qwen3-8B-Base on AlpacaEval2.0 from 8.8 to 44.5, reducing Mistral-7B Math500 latency by 18.9x relative to prior decoding work, and remaining robust to reward-model size.
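
The abstract does not spell out the form of the reward-tilted target distribution. A common form in the reward-guided decoding literature, assumed here purely for illustration, is p*(y|x) ∝ p_base(y|x) · exp(r(x, y) / β): the frozen model's prior reweighted by an exponentiated reward. The Python sketch below computes that tilt over a small candidate set and resamples among candidates already drawn from the base model. The function names, the temperature `beta`, and the toy numbers are hypothetical and are not taken from the paper; EBD's actual sampling procedure may differ.

```python
import math
import random


def tilted_distribution(base_logprobs, rewards, beta=1.0):
    """Probabilities proportional to p_base(y) * exp(r(y) / beta).

    Assumed generic form of a reward-tilted target; not the paper's code.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    # Work in log space: log p_base(y) + r(y) / beta.
    scores = [lp + r / beta for lp, r in zip(base_logprobs, rewards)]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def resample_from_base_samples(candidates, rewards, beta=1.0, rng=random):
    """Pick one candidate by self-normalized importance resampling.

    The candidates are assumed to be samples from the base model already,
    so the prior cancels and the importance weight is just exp(r / beta).
    """
    probs = tilted_distribution([0.0] * len(candidates), rewards, beta)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    # Toy prior over three candidate responses and hypothetical rewards.
    prior = [math.log(0.5), math.log(0.3), math.log(0.2)]
    rewards = [0.0, 1.0, 2.0]
    for beta in (100.0, 1.0, 0.1):
        probs = tilted_distribution(prior, rewards, beta)
        print(beta, [round(p, 3) for p in probs])
```

In this sketch a large `beta` leaves the distribution close to the base prior, while a small `beta` concentrates mass on the highest-reward candidate. That trade-off is one way to read the abstract's balance between steering toward high-utility responses and anchoring to the pre-trained prior.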

Top-level tags: llm, model evaluation, system
Detailed tags: reward-guided decoding, energy-based decoding, training-free, instruction following, benchmark evaluation

The Missing Piece in Pre-trained Model Evaluation: Reward-Guided Decoding Unlocks Task-Oriented Behavior Without Parameter Updates


1️⃣ One-sentence summary

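This paper proposes EBD, a training-free, reward-guided decoding method. By pairing a pre-trained language model with a lightweight reward model, it steers the model toward responses that follow instructions more closely and are more task-oriented, all without modifying the model's parameters. This enables a fairer evaluation of the model's actual capabilities and substantially improves downstream task performance.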

Source: arXiv: 2605.28020