arXiv submission date: 2026-04-07
📄 Abstract - On the Role of Fault Localization Context for LLM-Based Program Repair

Fault Localization (FL) is a key component of Large Language Model (LLM)-based Automated Program Repair (APR), yet its impact remains underexplored. In particular, it is unclear how much localization is needed, whether additional context beyond the predicted buggy location is beneficial, and how such context should be retrieved. We conduct a large-scale empirical study on 500 SWE-bench Verified instances using GPT-5-mini, evaluating 61 configurations that vary file-level, element-level, and line-level context. Our results show that more context does not consistently improve repair performance. File-level localization is the dominant factor, yielding a 15-17x improvement over a no-file baseline. Expanding file context is often associated with improved performance, with successful repairs most commonly observed in configurations with approximately 6-10 relevant files. Element-level context expansion provides conditional gains that depend strongly on the file context quality, while line-level context expansion frequently degrades performance due to noise amplification. LLM-based retrieval generally outperforms structural heuristics while using fewer files and tokens. Overall, the most effective FL context strategy typically combines a broad semantic understanding at higher abstraction levels with precise line-level localization. These findings challenge our assumption that increasing the localization context uniformly improves APR, and provide practical guidance for designing LLM-based FL strategies.

Top-level tags: llm systems model evaluation
Detailed tags: automated program repair fault localization software engineering empirical study context retrieval

On the Role of Fault Localization Context for LLM-Based Program Repair


1️⃣ One-sentence summary

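This study finds that, when large language models are used to repair programs automatically, supplying more fault localization context is not always better: the key is to combine high-level semantic understanding with precise line-level localization, and file-level context is the single most important factor in improving repair performance.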

Source: arXiv: 2604.05481