arXiv submission date: 2026-06-17
📄 Abstract - HandwritingAgent: Language-Driven Handwriting Synthesis in Scalable Vector Space

Teaching machines to emulate natural handwriting styles remains an open challenge, as it requires synthesizing stroke sequences that dynamically vary in shape, texture, pressure and script - not only across individuals, but also within a single person's handwriting. Attempts at this challenge have largely explored deep learning methods in both online and offline settings. However, these approaches are often constrained by style-specific architectural choices, heavy reliance on large datasets, high compute costs, and a lack of flexible control over writing styles through natural language. To this end, we introduce HandwritingAgent, a language-driven agent that can synthesize natural handwriting sequences directly in Scalable Vector Graphics (SVG) format with no need for style-specific training. The agent leverages a large reasoning model to geometrically analyse and autoregressively generate target handwritten glyphs as stroke sequences in a discrete grid canvas environment. Generation is conditioned on texts provided in either conversational or non-conversational mode, along with a reference handwriting-style image. Experiments on diverse handwriting tasks spanning imitation, recognition, multi-lingual handwriting synthesis, and generation of complex handwritten maths and science expressions indicate substantial improvement in performance, with HandwritingAgent matching or surpassing state-of-the-art generative handwriting models, while providing a more efficient, controllable, and generalizable synthesis method.
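The abstract describes glyphs generated as stroke sequences on a discrete grid canvas and emitted as SVG. It does not specify the stroke representation or the rendering step, so the following is only a minimal sketch of how such output could be serialized. It assumes each stroke is a list of (column, row) grid cells. The function name `strokes_to_svg` and the parameters `grid_size` and `cell` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumption: one pen-down stroke is an ordered list of (column, row) grid cells.
Stroke = List[Tuple[int, int]]


def strokes_to_svg(strokes: List[Stroke], grid_size: int = 32, cell: int = 8) -> str:
    """Render grid-cell strokes as an SVG document with one <path> per stroke.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    side = grid_size * cell  # canvas width and height in SVG user units
    paths = []
    for stroke in strokes:
        if not stroke:
            continue  # skip empty strokes rather than emit an invalid path
        # Map each grid cell to the coordinates of its centre.
        points = [(c * cell + cell // 2, r * cell + cell // 2) for c, r in stroke]
        # "M" moves the pen to the first point; each "L" draws a line to the next.
        d = "M " + " L ".join(f"{x} {y}" for x, y in points)
        paths.append(
            f'<path d="{d}" fill="none" stroke="black" '
            f'stroke-width="2" stroke-linecap="round"/>'
        )
    body = "\n  ".join(paths)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {side} {side}">\n'
        f"  {body}\n</svg>"
    )


if __name__ == "__main__":
    # Two strokes on a 4x4 grid: a short horizontal bar and a single dot.
    print(strokes_to_svg([[(0, 0), (1, 0)], [(2, 3)]], grid_size=4, cell=10))
```

Because the output is plain vector paths, it scales without loss of quality. A real system would likely use curves and per-stroke width rather than straight segments of fixed thickness.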

Top-level tags: llm agents multi-modal
Detailed tags: handwriting synthesis svg generation language-driven style imitation multi-lingual

HandwritingAgent: Language-Driven Handwriting Synthesis in Scalable Vector Space


1️⃣ One-sentence summary

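This paper proposes HandwritingAgent, a language-driven agent that needs no style-specific training: given only natural-language instructions and a reference handwriting sample, it generates realistic, varied handwriting stroke sequences directly in a vector graphics format (SVG). It matches or surpasses state-of-the-art generative handwriting models on tasks spanning imitation, recognition, multi-lingual handwriting synthesis, and the generation of complex handwritten maths and science expressions.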

Source: arXiv: 2606.18788