HI-TransPA: Hearing Impairments Translation Personal Assistant

📄 Abstract - HI-TransPA: Hearing Impairments Translation Personal Assistant

Hearing-impaired individuals often face significant barriers in daily communication due to the inherent challenges of producing clear speech. To address this, we introduce the Omni-Model paradigm into assistive technology and present HI-TransPA, an instruction-driven audio-visual personal assistant. The model fuses indistinct speech with lip dynamics, enabling both translation and dialogue within a single multimodal framework. To address the distinctive pronunciation patterns of hearing-impaired speech and the limited adaptability of existing models, we develop a multimodal preprocessing and curation pipeline that detects facial landmarks, stabilizes the lip region, and quantitatively evaluates sample quality. These quality scores guide a curriculum learning strategy that first trains on clean, high-confidence samples and progressively incorporates harder cases to strengthen model robustness. Architecturally, we employs a novel unified 3D-Resampler to efficiently encode the lip dynamics, which is critical for accurate interpretation. Experiments on purpose-built HI-Dialogue dataset show that HI-TransPA achieves state-of-the-art performance in both literal accuracy and semantic fidelity. Our work establishes a foundation for applying Omni-Models to assistive communication technology, providing an end-to-end modeling framework and essential processing tools for future research.

📄 论文总结

听力障碍翻译个人助手 / HI-TransPA: Hearing Impairments Translation Personal Assistant

1️⃣ 一句话总结

这项研究开发了一个名为HI-TransPA的多模态AI助手，它通过结合听障人士模糊的语音和唇部动态，在一个统一框架内实现精准的语音翻译和对话，有效提升了听障人士的日常沟通能力。

← 返回列表

菜单

🤖 AI 深度阅读

📄 论文总结

1️⃣ 一句话总结

密码管理

设置密码

修改密码

移除密码

菜单

🤖 AI 深度阅读

📄 论文总结

1️⃣ 一句话总结

获取最新论文摘要