arXiv submission date: 2026-06-25
📄 Abstract - Diagnosing Task Insensitivity in Language Agents

Large language models can serve as capable long-horizon agents, but their out-of-distribution (OOD) generalization remains weak. We identify a key source of this failure as task insensitivity: when faced with similar but distinct tasks, models might apply patterns learned during training and fail to solve the task at hand. We show that models often continue with actions aligned with the original task even when the instruction is semantically corrupted and cannot be directly answered. We further find that, when we replace the task description in a trained prompt with another similar but distinct task, the model may still output the same action. This behavior is accompanied by a consistent training-time attention drift away from task tokens and toward local observations, suggesting an optimization bias toward shortcuts. To mitigate this problem, we propose Task-Perturbed NLL Optimization, a lightweight contrastive regularizer that explicitly encourages action dependence on the task instruction. Extensive evaluations show that our intervention improves task sensitivity and OOD generalization while preserving more stable attention to task tokens.

Top-level tags: llm agents, model evaluation
Detailed tags: language agents, out-of-distribution generalization, task sensitivity, attention drift, contrastive regularization

Diagnosing Task Insensitivity in Language Agents


1️⃣ One-sentence summary

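This paper finds that large language models, when acting as long-horizon agents, tend to overlook differences between task instructions and act on shortcuts learned during training, which leads to poor generalization. It proposes a lightweight contrastive regularization method that strengthens the model's task sensitivity and generalization performance.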
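
The abstract does not give the form of the Task-Perturbed NLL objective, so the following is only an illustrative sketch of what a contrastive regularizer of this kind could look like, not the authors' formulation. It assumes a hinge-style term that asks the negative log-likelihood (NLL) of the training action to be higher under a perturbed task description than under the original one. The function names, the `margin` and `weight` parameters, and the toy logits are all hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of action logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def nll(logits, action):
    """Negative log-likelihood of the chosen action index."""
    return -log_softmax(logits)[action]


def task_perturbed_loss(logits_task, logits_perturbed, action,
                        margin=1.0, weight=0.5):
    """Standard NLL plus a hinge penalty (assumed form, not from the paper).

    logits_task:      action logits given the original task description
    logits_perturbed: action logits given a similar but distinct task
    The penalty is positive whenever the perturbed-task NLL does not exceed
    the original-task NLL by at least `margin`, i.e. whenever the model's
    action barely depends on the task instruction.
    """
    nll_task = nll(logits_task, action)
    nll_pert = nll(logits_perturbed, action)
    contrast = max(0.0, margin - (nll_pert - nll_task))
    return nll_task + weight * contrast


# Toy comparison: a task-insensitive model produces the same logits for both
# prompts, while a task-sensitive model shifts probability to another action.
original = [2.0, 0.0, 0.0]
insensitive = task_perturbed_loss(original, [2.0, 0.0, 0.0], action=0)
sensitive = task_perturbed_loss(original, [0.0, 2.0, 0.0], action=0)
print(insensitive > sensitive)
```

Under these assumptions the task-insensitive case pays the full penalty `weight * margin`, while the task-sensitive case pays only the ordinary NLL, which is the kind of pressure toward instruction-dependent actions that the summary describes.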

Source: arXiv: 2606.26918