arXiv submission date: 2026-06-01
📄 Abstract - TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech

Fine-grained morphosyntactic error annotation is important in clinical and developmental language research, yet it is labour-intensive, expert-dependent, and difficult to scale. We present TalkTag, an LLM-based lightweight tool fine-tuned to automate CHAT-style error annotation in spoken-language transcripts. Developed under conditions of extreme data scarcity using children's narrative data, the system shows the feasibility of linguistic analysis in low-resource settings. Our evaluation demonstrates that TalkTag produces encouragingly precise annotation while effectively identifying instances where linguistic ambiguity makes automated tagging genuinely complex. In summary, with TalkTag, we provide a scalable alternative to manual error annotation and practically viable support for morphosyntactic error annotation.
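To make "CHAT-style error annotation" concrete, the following is a minimal sketch of how bracketed error codes could be read out of an annotated utterance. It assumes the CHAT convention of marking an error with a bracketed code that begins with `*`. The function name `extract_error_codes`, the sample utterance, and the specific code shown are invented for illustration; this is not TalkTag's implementation and does not reflect its data.

```python
import re

# Assumption: an error code is written in square brackets and starts with "*",
# for example [* m:=ed]. Other bracketed material (such as a [: target] form)
# is deliberately not matched.
ERROR_CODE = re.compile(r"\[\*\s*([^\]]*)\]")


def extract_error_codes(utterance: str) -> list[str]:
    """Return the error codes found in one CHAT-style utterance line."""
    return [code.strip() for code in ERROR_CODE.findall(utterance)]


# Hypothetical utterance, invented for illustration (not from the paper's data).
example = "*CHI:\the goed [: went] [* m:=ed] to school ."
print(extract_error_codes(example))
```

A tool that automates this kind of annotation would produce the bracketed codes itself; the sketch only shows the shape of the output that such annotation yields.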

Top-level tags: llm, natural language processing, medical
Detailed tags: error annotation, morphosyntactic analysis, speech transcription, low-resource, fine-tuning

TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech


1️⃣ One-sentence summary

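This paper introduces TalkTag, a lightweight AI tool that automatically and accurately annotates grammatical errors in transcribed speech. It is especially suited to child language development research and clinical settings, and it works efficiently even when data are scarce, offering an alternative to traditional manual annotation that depends on experts.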

Source: arXiv: 2606.01820