A Unified and Reproducible Experimentation Framework for Speech Understanding
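1️⃣ One-sentence summary
This paper proposes SURE, a unified experimentation framework that standardizes prediction formats, post-processing, and evaluation. It addresses the difficulty of comparing and reproducing speech understanding models at deployment time, which stems from inconsistent evaluation standards. The framework also introduces an agent-assisted training conversion flow that helps turn paper code into unified, reproducible training pipelines.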
Speech foundation models and Speech LLMs have advanced speech understanding, yet deployment-oriented model selection is hindered by non-comparable evaluations caused by mismatched post-processing, and by training results that are hard to reproduce across data scales and pipelines. We present SURE, a unified experimentation framework that standardizes prediction formats, normalization, and scoring. SURE evaluates strong systems across paradigms, from conventional pipelines to Speech LLMs, on representative tasks under realistic acoustic and linguistic stressors. Beyond evaluation, SURE introduces an agent-assisted training conversion flow that maps paper and code into versioned, runnable training pipelines under a unified protocol on matched open-data subsets. Overall, SURE improves comparability and reproducibility for deployment-oriented evaluation.
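To illustrate why mismatched post-processing makes evaluations non-comparable, the following minimal sketch scores the same hypothesis with and without a shared normalization step. It is not SURE's implementation: the `normalize` rules (lowercasing, stripping ASCII punctuation, collapsing whitespace) and the function names are assumptions chosen for illustration, and the metric is a plain word error rate computed with word-level edit distance.

```python
import re
import string


def normalize(text: str) -> str:
    """Hypothetical normalization: lowercase, drop ASCII punctuation, collapse spaces."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(
                min(
                    prev[j] + 1,  # deletion
                    curr[j - 1] + 1,  # insertion
                    prev[j - 1] + (ref_word != hyp_word),  # substitution or match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "Hello, world! How are you?"
hypothesis = "hello world how are you"

# Scoring raw strings penalizes casing and punctuation differences.
raw_wer = word_error_rate(reference, hypothesis)

# Scoring after the same normalization on both sides removes that mismatch.
norm_wer = word_error_rate(normalize(reference), normalize(hypothesis))

print(f"raw WER: {raw_wer:.2f}, normalized WER: {norm_wer:.2f}")
```

In this toy case the raw score counts four of five words as errors even though the transcript is acoustically correct, while the normalized score is zero. Two systems evaluated under different normalization conventions would therefore report numbers that cannot be compared directly, which is the gap a shared protocol is meant to close.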
Source: arXiv: 2605.30899