arXiv submission date: 2026-04-13
📄 Abstract - Do Agent Rules Shape or Distort? Guardrails Beat Guidance in Coding Agents

Developers increasingly guide AI coding agents through natural language instruction files (e.g., this http URL, .cursorrules), yet no controlled study has measured whether these rules actually improve agent performance or which properties make a rule beneficial. We scrape 679 such files (25,532 rules) from GitHub and conduct the first large-scale empirical evaluation, running over 5,000 agent runs with a state-of-the-art coding agent on SWE-bench Verified. Rules improve performance by 7--14 percentage points, but random rules help as much as expert-curated ones -- suggesting rules work through context priming rather than specific instruction. Negative constraints ("do not refactor unrelated code") are the only individually beneficial rule type, while positive directives ("follow code style") actively hurt -- a pattern we analyze through the lens of potential-based reward shaping (PBRS). Moreover, individual rules are mostly harmful in isolation yet collectively helpful, with no degradation up to 50 rules. These findings expose a hidden reliability risk -- well-intentioned rules routinely degrade agent performance -- and provide a clear principle for safe agent configuration: constrain what agents must not do, rather than prescribing what they should.
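The abstract analyzes the constraints-vs-directives pattern through potential-based reward shaping (PBRS). As a quick illustration of the PBRS idea it invokes (not code from the paper; the potential `phi`, discount `gamma`, and the toy trajectory are all hypothetical), the following sketch shows the shaping bonus F(s, s') = γ·Φ(s') − Φ(s) and verifies that the discounted shaping terms telescope over a trajectory, which is why PBRS leaves the optimal policy unchanged:

```python
# Minimal sketch of potential-based reward shaping (PBRS).
# phi, gamma, and the trajectory are illustrative assumptions.

gamma = 0.9

def phi(state):
    # Hypothetical potential function: higher for states "closer to done".
    return float(state)

def shaping_bonus(s, s_next):
    # PBRS adds F(s, s') = gamma * phi(s') - phi(s) to the base reward.
    return gamma * phi(s_next) - phi(s)

# Discounted shaping terms telescope:
#   sum_t gamma^t * F(s_t, s_{t+1}) = gamma^T * phi(s_T) - phi(s_0),
# a constant for fixed endpoints, so the optimal policy is preserved.
trajectory = [0, 1, 2, 3]
total = sum(gamma**t * shaping_bonus(s, s_next)
            for t, (s, s_next) in enumerate(zip(trajectory, trajectory[1:])))
T = len(trajectory) - 1
assert abs(total - (gamma**T * phi(trajectory[-1]) - phi(trajectory[0]))) < 1e-9
print(total)
```

Under this lens, a badly chosen shaping signal (analogous to a positive directive like "follow code style") can distort behavior, while a constraint that only rules states out leaves the underlying objective intact.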

Top tags: agents llm model evaluation
Detailed tags: coding agents agent configuration reward shaping constraint-based guidance evaluation

Do Agent Rules Shape or Distort? Guardrails Beat Guidance in Coding Agents


1️⃣ One-sentence summary

Through large-scale experiments, this paper finds that when configuring rule files for AI coding assistants, prohibitive "guardrail" rules (e.g., "do not refactor unrelated code") effectively improve performance, while prescriptive "what to do" rules actively hurt. This exposes a hidden reliability risk in current rule configurations and yields a principle for safe agent configuration.

Source: arXiv 2604.11088