arXiv submission date: 2026-05-20
📄 Abstract - Manga109-v2026: Revisiting Manga109 Annotations for Modern Manga Understanding

Manga is a culturally distinctive multimodal medium and one of the most influential forms of Japanese popular culture. As AI systems increasingly target manga understanding, OCR, and translation, Manga109 has become a foundational dataset for manga-related AI research. However, the current Manga109 dataset contains transcription errors and coarse annotations, which do not align well with modern OCR and multimodal manga understanding tasks. In this work, we revisit the dialogue text annotations of Manga109 and identify five categories of annotation issues, including transcription errors, missing text regions, overlapping dialogue and onomatopoeia, and under-segmented speech balloons. To address these issues, we combine OCR-based issue detection and manual revision to construct Manga109-v2026, revising approximately 29,000 dialogue annotations. Our revisions better align Manga109 with modern OCR and multimodal manga understanding systems while preserving expressive structures characteristic of manga.
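The abstract does not say how the OCR-based issue detection works. As a rough illustration only, one plausible approach is to compare each existing transcription with fresh OCR output for the same region and queue large disagreements for manual revision. Everything in the sketch below (the `Annotation` type, `flag_for_review`, the `ocr_texts` mapping, and the threshold value) is a hypothetical assumption, not the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical record: one dialogue annotation and its transcription."""
    annotation_id: str
    text: str


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length, so the result is in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def flag_for_review(annotations, ocr_texts, threshold=0.2):
    """Return ids whose transcription differs from the OCR text by more than
    `threshold` (an assumed value); these would go to a human reviewer."""
    flagged = []
    for ann in annotations:
        ocr = ocr_texts.get(ann.annotation_id)
        if ocr is None:
            continue  # no OCR result for this region; nothing to compare
        if normalized_distance(ann.text, ocr) > threshold:
            flagged.append(ann.annotation_id)
    return flagged
```

A check like this can only surface candidates. OCR itself makes mistakes, so a disagreement does not show which side is wrong, which is consistent with the abstract pairing detection with manual revision.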

Top-level tags: multi-modal data, computer vision
Detailed tags: dataset annotation refinement, OCR, manga understanding, error correction

Manga109-v2026: Revisiting Manga109 Annotations for Modern Manga Understanding


1️⃣ One-sentence summary

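This paper targets Manga109, a foundational dataset for manga research. It systematically catalogs five categories of annotation issues in the dataset, including transcription errors, missing text, and overlapping regions. By combining automatic detection with manual correction, it revises approximately 29,000 dialogue annotations and releases the result as Manga109-v2026, a version better suited to current OCR and multimodal manga understanding techniques.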

Source: arXiv: 2605.21182