arXiv submission date: 2026-04-06
📄 Abstract - Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling

Recent large language models have shifted SVG generation from differentiable rendering optimization to autoregressive program synthesis. However, existing approaches still rely on generic byte-level tokenization inherited from natural language processing, which poorly reflects the geometric structure of vector graphics. Numerical coordinates are fragmented into discrete symbols, destroying spatial relationships and introducing severe token redundancy, often leading to coordinate hallucination and inefficient long-sequence generation. To address these challenges, we propose HiVG, a hierarchical SVG tokenization framework tailored for autoregressive vector graphics generation. HiVG decomposes raw SVG strings into structured atomic tokens and further compresses executable command-parameter groups into geometry-constrained segment tokens, substantially improving sequence efficiency while preserving syntactic validity. To further mitigate spatial mismatch, we introduce a Hierarchical Mean-Noise (HMN) initialization strategy that injects numerical ordering signals and semantic priors into new token embeddings. Combined with a curriculum training paradigm that progressively increases program complexity, HiVG enables more stable learning of executable SVG programs. Extensive experiments on both text-to-SVG and image-to-SVG tasks demonstrate improved generation fidelity, spatial consistency, and sequence efficiency compared with conventional tokenization schemes. Our code is publicly available.
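
To make the two tokenization levels in the abstract concrete, here is a minimal Python sketch. It is not the HiVG implementation: the function names `atomic_tokens` and `segment_tokens`, the regular expression, and the arity table are assumptions introduced for illustration only. The sketch splits an SVG path string into whole-number and command tokens (the "atomic" level), then groups each command with exactly the parameters it requires (the "segment" level), so that a byte-level sequence, an atomic sequence, and a segment sequence can be compared by length.

```python
import re

# Assumed token pattern: one path command letter, or one complete number.
# Keeping a number whole avoids splitting "10" into "1" and "0".
_TOKEN_RE = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")

# Number of numeric parameters each SVG path command takes.
_ARITY = {"M": 2, "L": 2, "H": 1, "V": 1, "C": 6,
          "S": 4, "Q": 4, "T": 2, "A": 7, "Z": 0}


def atomic_tokens(path_d: str) -> list[str]:
    """Split a path 'd' string into command letters and whole numbers."""
    return _TOKEN_RE.findall(path_d)


def segment_tokens(atoms: list[str]) -> list[tuple[str, tuple[float, ...]]]:
    """Group each command with the parameters it needs.

    Simplification (assumption): every parameter group must be preceded by
    an explicit command letter; implicit command repetition is not handled.
    """
    segments = []
    i = 0
    while i < len(atoms):
        cmd = atoms[i]
        if cmd.upper() not in _ARITY:
            raise ValueError(f"expected a command letter, got {cmd!r}")
        n = _ARITY[cmd.upper()]
        params = tuple(float(x) for x in atoms[i + 1 : i + 1 + n])
        if len(params) != n:
            raise ValueError(f"command {cmd!r} needs {n} parameters")
        segments.append((cmd, params))
        i += 1 + n
    return segments


if __name__ == "__main__":
    path = "M 10 20 L 30 40 C 1 2 3 4 5 6 Z"
    atoms = atomic_tokens(path)
    segs = segment_tokens(atoms)
    print("byte-level length:", len(path.encode("utf-8")))
    print("atomic tokens:    ", len(atoms))
    print("segment tokens:   ", len(segs))
```

Because each segment carries a command together with a parameter count fixed by the path grammar, a malformed group raises an error instead of silently producing an invalid path. This is one simple way to read "preserving syntactic validity"; how the paper actually maps segments to vocabulary entries is not described in the abstract.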

Top-level tags: natural language processing, multi-modal, model training
Detailed tags: SVG generation, tokenization, vector graphics, program synthesis, autoregressive modeling

Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling


1️⃣ One-sentence summary

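This paper proposes HiVG, a hierarchical tokenization method that lets a model generating vector graphics (such as SVG icons or illustrations) represent and construct geometric structure more efficiently and accurately, reducing errors and improving generation quality.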

Source: arXiv: 2604.05072