arXiv submission date: 2026-01-20
📄 Abstract - Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning

Instruction tuning is a standard paradigm for adapting large language models (LLMs), but modern instruction datasets are large, noisy, and redundant, making full-data fine-tuning costly and often unnecessary. Existing data selection methods either build expensive gradient datastores or assign static scores from a weak proxy, largely ignoring evolving uncertainty, and thus missing a key source of LLM interpretability. We propose GRADFILTERING, an objective-agnostic, uncertainty-aware data selection framework that utilizes a small GPT-2 proxy with a LoRA ensemble and aggregates per-example gradients into a Gradient Signal-to-Noise Ratio (G-SNR) utility. Our method matches or surpasses random subsets and strong baselines in most LLM-as-a-judge evaluations as well as in human assessment. Moreover, GRADFILTERING-selected subsets converge faster than competitive filters under the same compute budget, reflecting the benefit of uncertainty-aware scoring.
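
The abstract names the G-SNR utility without defining it. As a rough illustration only, here is a minimal Python sketch that assumes G-SNR is the ratio of the mean per-example gradient to its standard deviation across K LoRA ensemble members on the proxy model; the function names, the (K, D) gradient layout, and the top-fraction selection rule are assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical sketch of a G-SNR style utility score (not the paper's code).
# Assumption: G-SNR = mean over coordinates of |mu| / sigma, where mu and sigma
# are the mean and std of an example's gradient across K LoRA ensemble members.
import torch

def g_snr(per_member_grads: torch.Tensor, eps: float = 1e-8) -> float:
    """per_member_grads: (K, D) tensor, one flattened gradient per ensemble member."""
    mu = per_member_grads.mean(dim=0)     # signal: mean gradient over the ensemble
    sigma = per_member_grads.std(dim=0)   # noise: ensemble disagreement per coordinate
    # Aggregate coordinate-wise SNR into one scalar utility for the example.
    return (mu.abs() / (sigma + eps)).mean().item()

def select_top_fraction(scores: list[float], fraction: float = 0.1) -> list[int]:
    """Rank examples by utility and keep the indices of the top fraction."""
    k = max(1, int(len(scores) * fraction))
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```

Under this reading, an example whose gradients agree across ensemble members (high signal, low noise) scores high, while an example whose gradients vary wildly (high uncertainty) scores low, which matches the uncertainty-aware framing in the abstract.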

Top tags: llm, model training, data
Detailed tags: instruction tuning, data selection, uncertainty, gradient signal-to-noise ratio, efficient training

Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning


1️⃣ One-sentence summary

This paper proposes a new method called GRADFILTERING, which intelligently filters high-quality instruction data by computing each data sample's gradient signal-to-noise ratio, letting large language models learn faster and perform better while reducing training cost.

Source: arXiv: 2601.13697