Automatic Speech Recognition for Documenting Endangered Languages: Case Study of Ikema Miyakoan

📄 Abstract

Language endangerment poses a major challenge to linguistic diversity worldwide, and technological advances have opened new avenues for documentation and revitalization. Among these, automatic speech recognition (ASR) has shown increasing potential to assist in the transcription of endangered language data. This study focuses on Ikema, a severely endangered Ryukyuan language spoken in Okinawa, Japan, with approximately 1,300 remaining speakers, most of whom are over 60 years old. We present an ongoing effort to develop an ASR system for Ikema based on field recordings. Specifically, we (1) construct a {\totaldatasethours}-hour speech corpus from field recordings, (2) train an ASR model that achieves a character error rate as low as 15%, and (3) evaluate the impact of ASR assistance on the efficiency of speech transcription. Our results demonstrate that ASR integration can substantially reduce transcription time and cognitive load, offering a practical pathway toward scalable, technology-supported documentation of endangered languages.
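For readers unfamiliar with the metric, character error rate (CER) is conventionally computed as the character-level edit distance between a reference transcript and the ASR output, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function names and the placeholder strings are assumptions introduced here for illustration, and the abstract does not say whether any text normalization is applied before scoring.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn ref into hyp."""
    # prev[j] holds the distance between the ref prefix processed so far
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit_distance(ref, hyp) / len(ref)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Placeholder strings (hypothetical, not Ikema data): a 20-character
# reference with one substitution, one deletion, and one insertion.
reference = "abcdefghijklmnopqrst"
hypothesis = "abXdefghijklmnopqstZ"
print(f"CER = {character_error_rate(reference, hypothesis):.2f}")
```

Under this definition, a CER of 15% corresponds to roughly 15 character edits per 100 reference characters, which is the amount of correction a transcriber would still need to perform on the ASR output.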

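1️⃣ One-Sentence Summary

By developing an automatic speech recognition system for Ikema Miyakoan, an endangered language of Japan, this study shows that the technology can significantly improve the efficiency of language documentation, offering a practical, technology-assisted approach to safeguarding endangered languages.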
