arXiv submission date: 2026-03-11
📄 Abstract - Towards Robust Speech Deepfake Detection via Human-Inspired Reasoning

Modern generative audio models can be used unlawfully by adversaries, in particular to impersonate other people and gain access to private information. To mitigate this threat, speech deepfake detection (SDD) methods have been developed. Unfortunately, current SDD methods generally generalize poorly to new audio domains and generators. Moreover, they lack interpretability, especially human-like reasoning that would naturally explain why a given audio sample is attributed to the bona fide or spoof class and would provide human-perceptible cues. In this paper, we propose HIR-SDD, a novel SDD framework that combines the strengths of Large Audio Language Models (LALMs) with chain-of-thought reasoning derived from a newly proposed human-annotated dataset. Experimental evaluation demonstrates both the effectiveness of the proposed method and its ability to provide reasonable justifications for its predictions.

Top-level tags: audio, natural language processing, model evaluation
Detailed tags: speech deepfake detection, large audio language models, chain-of-thought, interpretability, generalization

Towards Robust Speech Deepfake Detection via Human-Inspired Reasoning


1️⃣ One-sentence summary

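This paper proposes a new framework that combines large audio language models with human-like chain-of-thought reasoning. It not only detects spoofed speech from different sources more effectively, but also explains its decisions in terms that are easy for humans to understand.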

Source: arXiv: 2603.10725