arXiv submission date: 2026-06-08
📄 Abstract - Optical Music Recognition for Real-World Manuscripts with Synthetic Data

Optical Music Recognition (OMR) has seen major progress in model design, with end-to-end methods now capable of recognising notation at all levels of complexity. However, the impact of this progress has been limited by the visual domains of available training datasets, which are largely born-digital. Existing large collections of sheet music in libraries and other heritage institutions contain predominantly manuscripts, whose visual domains are highly diverse and different, so existing OMR systems fail when applied in the real world. These institutions are often resource-constrained, so large in-domain datasets cannot be expected. We provide a first baseline on real-world manuscripts with complex piano notation in the resource-constrained scenario. Using fine-grained music notation graph (MuNG) annotations and the Smashcima synthesis tool, we then show that while some direct transcriptions of in-domain data remain essential, domain adaptation using synthetic musical manuscript images brings significant improvement. Furthermore, the symbols used do not need to be in-domain, so the expensive fine-grained annotation can be avoided. We thus bring OMR closer to one of its stated goals: preserving and promoting musical cultural heritage.

Top-level tags: computer vision, machine learning, AIGC
Detailed tags: optical music recognition, synthetic data, domain adaptation, handwritten music, cultural heritage

Optical Music Recognition for Real-World Manuscripts with Synthetic Data


1️⃣ One-sentence summary

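To address the failure of existing optical music recognition systems on real-world handwritten scores (such as manuscripts held in library collections), caused by diverse visual styles and scarce training data, this paper proposes a low-cost approach that combines synthetic images with a small amount of annotated real data, which significantly improves recognition and marks a substantial step toward preserving musical cultural heritage.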

Source: arXiv: 2606.09479