ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch
1️⃣ One-sentence summary
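This paper proposes ChartVerse, a framework that programmatically generates complex, diverse charts and high-quality question-answer data. It addresses the shortage and low quality of training data for chart reasoning in vision language models, and the resulting data is used to train a state-of-the-art model.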
Chart reasoning is a critical capability for Vision Language Models (VLMs). However, the development of open-source models is severely hindered by the lack of high-quality training data. Existing datasets suffer from a dual challenge: synthetic charts are often simplistic and repetitive, while the associated QA pairs are prone to hallucinations and lack the reasoning depth required for complex tasks. To bridge this gap, we propose ChartVerse, a scalable framework designed to synthesize complex charts and reliable reasoning data from scratch. (1) To address the bottleneck of simple patterns, we first introduce Rollout Posterior Entropy (RPE), a novel metric that quantifies chart complexity. Guided by RPE, we develop a complexity-aware chart coder that autonomously synthesizes diverse, high-complexity charts via executable programs. (2) To guarantee reasoning rigor, we develop truth-anchored inverse QA synthesis. Diverging from standard generation, we adopt an answer-first paradigm: we extract deterministic answers directly from the source code, generate questions conditioned on these anchors, and enforce strict consistency verification. To further elevate difficulty and reasoning depth, we filter samples based on model fail-rate and distill high-quality Chain-of-Thought (CoT) reasoning. We curate ChartVerse-SFT-600K and ChartVerse-RL-40K using Qwen3-VL-30B-A3B-Thinking as the teacher. Experimental results demonstrate that ChartVerse-8B achieves state-of-the-art performance, notably surpassing its teacher and rivaling the stronger Qwen3-VL-32B-Thinking.
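The fail-rate filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `QASample` type, the string-matching rule, and the `low`/`high` thresholds are all hypothetical choices made for illustration. The sketch assumes each QA pair carries an answer anchored in the chart's source code, that a model has produced several answer rollouts for it, and that samples which are never failed (too easy) or always failed (possibly unanswerable) are dropped.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QASample:
    """Hypothetical container for one answer-first QA pair."""
    question: str
    answer: str  # anchor answer, assumed to be computed from the chart's source code


def normalize(text: str) -> str:
    """Assumed matching rule: case-insensitive, surrounding whitespace ignored."""
    return text.strip().lower()


def fail_rate(sample: QASample, rollouts: List[str]) -> float:
    """Fraction of model rollouts whose answer differs from the anchor."""
    if not rollouts:
        raise ValueError("need at least one rollout")
    target = normalize(sample.answer)
    wrong = sum(1 for r in rollouts if normalize(r) != target)
    return wrong / len(rollouts)


def keep_sample(rate: float, low: float = 0.25, high: float = 1.0) -> bool:
    """Keep samples with low <= rate < high (thresholds are illustrative assumptions)."""
    return low <= rate < high


if __name__ == "__main__":
    sample = QASample(question="What is the peak value of series A?", answer="42")
    rollouts = ["42", "41", " 42 ", "7"]  # made-up model outputs
    rate = fail_rate(sample, rollouts)
    print(rate, keep_sample(rate))
```

In this toy run, two of the four rollouts miss the anchor, so the sample has a fail-rate of 0.5 and is kept. A real pipeline would likely need a more tolerant answer matcher (numeric tolerance, unit handling) than exact string comparison.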
Source: arXiv: 2601.13606