arXiv submission date: 2026-04-14
📄 Abstract - CoDe-R: Refining Decompiler Output with LLMs via Rationale Guidance and Adaptive Inference

Binary decompilation is a critical reverse engineering task aimed at reconstructing high-level source code from stripped executables. Although Large Language Models (LLMs) have recently shown promise, they often suffer from "logical hallucinations" and "semantic misalignment" due to the irreversible semantic loss during compilation, resulting in generated code that fails to re-execute. In this study, we propose Cognitive Decompiler Refinement with Robustness (CoDe-R), a lightweight two-stage code refinement framework. The first stage introduces Semantic Cognitive Enhancement (SCE), a Rationale-Guided Semantic Injection strategy that trains the model to recover high-level algorithmic intent alongside code. The second stage introduces a Dynamic Dual-Path Fallback (DDPF) mechanism during inference, which adaptively balances semantic recovery and syntactic stability via a hybrid verification strategy. Evaluation on the HumanEval-Decompile benchmark demonstrates that CoDe-R (using a 1.3B backbone) establishes a new State-of-the-Art (SOTA) in the lightweight regime. Notably, it is the first 1.3B model to exceed an Average Re-executability Rate of 50.00%, significantly outperforming the baseline and effectively bridging the gap between efficient models and expert-level performance. Our code is available at this https URL.
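The abstract describes the Dynamic Dual-Path Fallback (DDPF) as choosing adaptively between a semantically ambitious output and a syntactically conservative one, gated by a hybrid verification step. The sketch below is a minimal illustration of that control flow, not the paper's implementation: the function names are hypothetical, and compilation is used as a stand-in for the paper's (unspecified) hybrid verification.

```python
def verify(candidate_src: str) -> bool:
    """Stand-in for hybrid verification: here, just check that the
    candidate parses/compiles. The paper's actual checks are richer."""
    try:
        compile(candidate_src, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


def dual_path_fallback(semantic_candidate: str, syntactic_candidate: str) -> str:
    """Prefer the semantic-recovery path when it passes verification;
    otherwise fall back to the syntactically stable path."""
    if verify(semantic_candidate):
        return semantic_candidate
    return syntactic_candidate


# A malformed semantic candidate (missing colon) triggers the fallback.
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b) return a + b\n"
chosen = dual_path_fallback(bad, good)
```

The key design point this models is that the fallback is per-sample and verification-driven, so the system only pays the stability cost of the conservative path when the ambitious path demonstrably fails.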

Top-level tags: llm, systems, model training
Detailed tags: binary decompilation, code refinement, rationale guidance, adaptive inference, re-executability

CoDe-R: Refining Decompiler Output with LLMs via Rationale Guidance and Adaptive Inference


1️⃣ One-sentence summary

This paper proposes CoDe-R, a lightweight two-stage code-refinement framework that guides the model to recover algorithmic intent and combines this with an adaptive verification strategy, significantly improving the accuracy and re-executability of source code reconstructed from binaries and achieving state-of-the-art performance among lightweight models.

Source: arXiv:2604.12913