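BlasBench: An Open Benchmark for Irish Speech Recognition
1️⃣ One-sentence summary
This paper presents BlasBench, an open benchmark tool built specifically for evaluating Irish speech recognition, which introduces Irish-specific text normalisation and a reproducible scoring framework to expose the performance differences among existing models on Irish and their cross-dataset generalisation problems.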
Existing multilingual benchmarks include Irish among dozens of languages but apply no Irish-aware text normalisation, which makes reliable and reproducible ASR comparison impossible. We introduce BlasBench, an open evaluation harness that provides a standalone Irish-aware normaliser preserving fadas, lenition, and eclipsis; a reproducible scoring harness; and per-utterance predictions released for all evaluated runs. We pilot it by benchmarking 12 systems across four architecture families on Common Voice ga-IE and FLEURS ga-IE. All Whisper variants exceed 100% WER through insertion-driven hallucination. Microsoft Azure reaches 22.2% WER on Common Voice and 57.5% on FLEURS; the best open model, Omnilingual ASR 7B, reaches 30.65% and 39.09% respectively. Models fine-tuned on Common Voice degrade by 33-43 points when moving to FLEURS, while massively multilingual models degrade by only 7-10 points, a generalisation gap that single-dataset evaluation misses.
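To make the scoring idea concrete, here is a minimal sketch of an Irish-aware normaliser feeding a word error rate (WER) computation. It is not BlasBench's code: the names `normalise_irish` and `wer` and the specific rules are assumptions chosen for illustration. The normaliser keeps fadas (it never strips accents) and keeps word-internal hyphens and apostrophes, so prefixed forms such as "n-athair" survive. The WER is computed per utterance as word-level edit distance divided by reference length; a real harness would typically pool errors over the whole test set.

```python
import re
import unicodedata

# Hypothetical rule: drop punctuation except hyphens and apostrophes.
_PUNCT = re.compile(r"[^\w\s'-]")
# Hyphens/apostrophes that are not between two word characters.
_STRAY = re.compile(r"(?<!\w)[-']+|[-']+(?!\w)")


def normalise_irish(text: str) -> str:
    """Illustrative normaliser: lowercases but never strips fadas."""
    text = unicodedata.normalize("NFC", text)  # compose a + combining acute -> á
    text = text.replace("\u2019", "'")         # unify curly apostrophes
    text = text.lower()
    text = _PUNCT.sub(" ", text)
    text = _STRAY.sub(" ", text)
    return " ".join(text.split())              # collapse whitespace


def wer(reference: str, hypothesis: str) -> float:
    """Per-utterance WER: word-level edit distance / reference length."""
    ref = normalise_irish(reference).split()
    hyp = normalise_irish(hypothesis).split()
    if not ref:
        raise ValueError("reference is empty after normalisation")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match if equal
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(normalise_irish("Ár n-Athair, atá ar neamh!"))
    # Five inserted words against a two-word reference give WER 2.5 (250%).
    print(wer("tá sé", "tá sé go maith agus go deas"))
```

Because insertions count as errors but the denominator is the reference length, a hypothesis padded with hallucinated words can push WER above 100%, which is the pattern the abstract reports for Whisper. The generalisation gap quoted above is simply the FLEURS WER minus the Common Voice WER: for Omnilingual ASR 7B, 39.09 - 30.65 = 8.44 points.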
Source: arXiv: 2604.10736