arXiv submission date: 2026-04-13
📄 Abstract - DuET: Dual Execution for Test Output Prediction with Generated Code and Pseudocode

This work addresses test output prediction, a key challenge in test case generation. To improve the reliability of LLM-predicted outputs, prior approaches generate code first to ground predictions. One grounding strategy is direct execution of the generated code, but even minor errors can cause failures. To address this, we introduce LLM-based pseudocode execution, which grounds prediction in more error-resilient pseudocode and simulates its execution via LLM reasoning. We further propose DuET, a dual-execution framework that combines both approaches through functional majority voting. Our analysis shows the two approaches are complementary: direct execution suffers from code errors, while pseudocode reasoning suffers from hallucination. On LiveCodeBench, DuET achieves state-of-the-art performance, improving Pass@1 by 13.6 pp.

Top-level tags: llm model evaluation systems
Detailed tags: test output prediction code generation pseudocode execution majority voting benchmark

DuET:基于生成代码与伪代码双重执行的测试输出预测 / DuET: Dual Execution for Test Output Prediction with Generated Code and Pseudocode


1️⃣ One-sentence summary

This paper proposes DuET, a dual-execution framework that combines two complementary methods, directly executing generated code and having a large language model simulate the execution of pseudocode, and applies majority voting over their results to predict software test outputs more reliably, significantly improving prediction accuracy.
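The voting step above can be sketched in a few lines. This is a minimal illustration of functional majority voting over outputs from the two execution strategies; the function name, the use of `None` for failed runs, and the tie-breaking behavior are assumptions for illustration, not details from the paper.

```python
from collections import Counter

def duet_vote(candidate_outputs):
    """Pick the most common predicted output across execution strategies.

    candidate_outputs: predicted output strings, e.g. one from directly
    executing generated code and several from LLM-simulated pseudocode
    runs. Failed executions are passed as None and ignored.
    """
    valid = [o for o in candidate_outputs if o is not None]
    if not valid:
        return None  # every strategy failed
    # Majority vote: the output agreed on by the most strategies wins;
    # Counter.most_common breaks ties by first-seen order.
    return Counter(valid).most_common(1)[0][0]

# Hypothetical case: direct execution crashed (None), two pseudocode
# simulations agree on "42", one hallucinated "41".
print(duet_vote([None, "42", "42", "41"]))  # prints "42"
```

In this sketch, a crashed direct execution simply drops out of the vote, so the error-resilient pseudocode path can still produce a prediction, which mirrors the complementarity the abstract describes.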

Source: arXiv 2604.11514