arXiv submission date: 2026-05-14
📄 Abstract - TAPIOCA: Why Task-Aware Pruning Improves OOD model Capability

Recent work has promoted task-aware layer pruning as a way to improve model performance on particular tasks, as shown by TALE. In this paper, we investigate when such improvements occur and why. We show first that, across controlled polynomial regression tasks and large language models, such pruning yields no benefit on in-distribution (ID) data but consistently improves out-of-distribution (OOD) accuracy. We further show empirically that OOD inputs induce layerwise norm and pairwise-distance profiles that deviate from the corresponding ID profiles. This leads to a geometric explanation of task-aware pruning: each task induces a task-adapted geometry, characterized empirically by the representation profiles observed on ID inputs. OOD inputs can introduce a distorted version of the task-adapted geometry. Task-aware pruning identifies layers that create or amplify this distortion; by removing them, it shifts OOD representational norms and pairwise distances toward those observed on the adapted distribution. This realigns OOD inputs with the model's task-adapted geometry and improves performance. We provide causal evidence through controlled distribution shifts and residual-scaling interventions, and demonstrate consistent behavior across model scales.
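
The layerwise norm and pairwise-distance profiles mentioned in the abstract can be illustrated with a minimal sketch. It assumes hidden states are already extracted as nested lists, with `hidden_states[l][i]` holding the representation of input `i` after layer `l`. The function names `layer_profiles` and `profile_gap` are hypothetical and do not come from the paper, and the paper's exact metrics may differ.

```python
import math
from itertools import combinations


def layer_profiles(hidden_states):
    """Return (mean_norms, mean_pairwise_distances), one value per layer.

    hidden_states[l][i] is the representation (a list of floats) of input i
    after layer l. Each layer needs at least two inputs so that pairwise
    distances are defined.
    """
    mean_norms, mean_dists = [], []
    for layer in hidden_states:
        # Average Euclidean norm of the representations at this layer.
        mean_norms.append(sum(math.hypot(*v) for v in layer) / len(layer))
        # Average Euclidean distance over all unordered pairs of inputs.
        pairs = list(combinations(layer, 2))
        mean_dists.append(sum(math.dist(a, b) for a, b in pairs) / len(pairs))
    return mean_norms, mean_dists


def profile_gap(id_profile, ood_profile):
    """Mean absolute layerwise deviation between an ID and an OOD profile."""
    return sum(abs(a - b) for a, b in zip(id_profile, ood_profile)) / len(id_profile)
```

Under these assumptions, a smaller `profile_gap` after removing a layer would correspond to OOD representations moving closer to the ID profile, which is the effect the abstract attributes to task-aware pruning.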

Top-level tags: llm theory model evaluation
Detailed tags: pruning out-of-distribution representation geometry task adaptation generalization

TAPIOCA: Why Task-Aware Pruning Improves OOD model Capability


1️⃣ One-sentence summary

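This paper shows that task-aware pruning, although it brings no clear benefit on in-distribution data, removes the network layers that distort the geometry of input representations, so that the representation profiles of out-of-distribution data again match the internal structure the task requires, which markedly improves model performance in unseen scenarios.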

Source: arXiv: 2605.14738