Abstract - MM-ReCoder: Advancing Chart-to-Code Generation with Reinforcement Learning and Self-Correction
Multimodal Large Language Models (MLLMs) have recently demonstrated promising capabilities in multimodal coding tasks such as chart-to-code generation. However, existing methods rely primarily on supervised fine-tuning (SFT), which teaches the model code patterns from chart-code pairs but never exposes it to a code execution environment. Moreover, while self-correction through execution feedback offers a potential route to improving code quality, even state-of-the-art MLLMs have been shown to struggle with effective self-correction. In this work, we introduce MM-ReCoder, a chart-to-code generation model trained with reinforcement learning (RL) and equipped with self-correction ability. We propose a two-stage multi-turn self-correction RL strategy based on Group Relative Policy Optimization (GRPO). The first stage enhances the model's self-correction ability by rolling out a shared first turn, while the second stage improves coding capability through full-trajectory optimization. MM-ReCoder learns to produce more accurate and executable code through interaction with the environment and by iteratively correcting its own outputs. Results on three chart-to-code benchmarks demonstrate that MM-ReCoder achieves state-of-the-art performance.
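To make the GRPO-based training signal concrete, the sketch below shows the standard group-relative advantage computation that GRPO methods use: each sampled trajectory's reward is normalized against the mean and standard deviation of its rollout group. The stage-1 framing in the comments (all rollouts sharing one first-turn draft so that the advantage isolates the correction turn) is our reading of the abstract, not code from the paper; reward values and names here are illustrative.

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each rollout's scalar reward
    against the mean and (population) std of its sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All rollouts scored identically: no learning signal for this group.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Hypothetical stage-1 group: per the abstract, every rollout in the group
# shares the same first-turn code draft and diverges only on the
# self-correction turn, so the relative advantage credits (or penalizes)
# the correction step itself rather than the initial attempt.
rewards = [1.0, 0.0, 0.5, 1.0]  # e.g. execution/chart-similarity rewards
advs = group_relative_advantages(rewards)
```

Under this normalization the advantages in each group sum to zero, so the failed rollout (reward 0.0) receives a negative advantage while the successful corrections are reinforced.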
MM-ReCoder: Advancing Chart-to-Code Generation with Reinforcement Learning and Self-Correction
1️⃣ One-Sentence Summary
This paper proposes a new model called MM-ReCoder, which combines reinforcement learning with a multi-turn self-correction strategy: when converting charts to code, the model iteratively fixes its own errors through interaction with an execution environment, producing more accurate and executable code and achieving leading performance on multiple benchmarks.