Merge-Bench: Resolve Merge Conflicts with Large Language Models
1️⃣ One-sentence summary
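This paper builds Merge-Bench, a large-scale dataset of real-world merge conflicts, and uses reinforcement learning to train a model, LLMergeJ, that resolves merge conflicts in Java code well, even outperforming several commercial models on some tasks; overall, however, even the best models correctly resolve fewer than 60% of conflicts.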
This paper applies machine learning to the difficult and important task of version control merging. (1) We constructed a dataset, Merge-Bench, of 7938 real-world merge conflict hunks from 1439 GitHub repositories. The ground truth is the merge resolution that developers committed to the repository. Our dataset construction methodology is scalable to arbitrary amounts of data since no manual labeling is required. (2) We trained a model, LLMergeJ, to resolve merge conflicts in Java programs. Our approach uses Group Relative Policy Optimization (GRPO), an online reinforcement learning method, to train a Large Language Model (LLM). (3) We performed two evaluations of the performance of LLMs on resolving merge conflicts. On Java programs, LLMergeJ with 14B parameters outperforms 3 commercial LLMs, trailing only Gemini 2.5 Pro. Across 11 programming languages, commercial LLM performance is largely stable from language to language. The best models correctly resolve less than 60% of merge conflicts.
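The core idea behind GRPO, scoring each sampled resolution relative to the other samples drawn for the same conflict, can be sketched in a few lines. This is a minimal illustration, not the paper's training code: the function names, the whitespace-normalized exact-match reward, and the use of the population standard deviation are all assumptions made for the example, since the abstract does not specify the reward.

```python
from statistics import mean, pstdev


def exact_match_reward(candidate: str, developer_resolution: str) -> float:
    """Hypothetical reward (an assumption, not the paper's): 1.0 if the
    candidate equals the committed resolution once surrounding blank lines
    and trailing whitespace on each line are ignored, else 0.0."""
    def normalize(text: str) -> list:
        return [line.rstrip() for line in text.strip().splitlines()]

    return 1.0 if normalize(candidate) == normalize(developer_resolution) else 0.0


def group_relative_advantages(rewards: list, eps: float = 1e-8) -> list:
    """Standardize rewards within one group of samples for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so samples
    are judged against their siblings rather than by a learned value model.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Ground truth: the resolution the developers committed (toy Java snippet).
truth = "int total = a + b;\nreturn total;\n"

# Four sampled resolutions for the same conflict hunk.
candidates = [
    "int total = a + b;\nreturn total;",      # matches
    "int total = a - b;\nreturn total;",      # wrong operator
    "return a + b;",                          # different code
    "int total = a + b;  \nreturn total;\n",  # matches up to whitespace
]

rewards = [exact_match_reward(c, truth) for c in candidates]
advantages = group_relative_advantages(rewards)
```

With a binary reward like this one, a group in which every sample succeeds or every sample fails yields all-zero advantages and therefore no learning signal, which is one reason a graded reward might be preferred in practice.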
Source: arXiv: 2605.25890