Multi-Channel Replay Speech Detection using Acoustic Maps
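1️⃣ One-Sentence Summary
This paper proposes a new feature representation called "acoustic maps", which uses a multi-microphone array to capture how sound energy is distributed across directions in space, in order to distinguish live human speech from replayed recordings. At its core is a lightweight neural network that maintains strong performance while remaining physically interpretable.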
Replay attacks remain a critical vulnerability for automatic speaker verification systems, particularly in real-time voice assistant applications. In this work, we propose acoustic maps as a novel spatial feature representation for replay speech detection from multi-channel recordings. Derived from classical beamforming over discrete azimuth and elevation grids, acoustic maps encode directional energy distributions that reflect physical differences between human speech radiation and loudspeaker-based replay. A lightweight convolutional neural network is designed to operate on this representation, achieving competitive performance on the ReMASC dataset with approximately 6k trainable parameters. Experimental results show that acoustic maps provide a compact and physically interpretable feature space for replay attack detection across different devices and acoustic environments.
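The abstract does not specify the beamformer, so the following is only a minimal sketch of how a directional energy map over an azimuth/elevation grid could be computed. It assumes a far-field, frequency-domain delay-and-sum beamformer. The function names (`steering_directions`, `acoustic_map`), the speed-of-sound constant, and the array geometry are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_directions(azimuths, elevations):
    """Unit vectors for every grid cell, shape (E, A, 3). Angles in radians."""
    az, el = np.meshgrid(azimuths, elevations)  # both (E, A)
    return np.stack(
        [np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)],
        axis=-1,
    )


def acoustic_map(signals, mic_positions, fs, azimuths, elevations):
    """Delay-and-sum steered response power (hypothetical helper).

    signals:       (M, T) time-domain samples, one row per microphone
    mic_positions: (M, 3) microphone coordinates in metres
    returns:       (E, A) beamformer output energy per look direction
    """
    n_mics, n_samples = signals.shape
    spectra = np.fft.rfft(signals, axis=1)            # (M, F)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)    # (F,)

    directions = steering_directions(azimuths, elevations)  # (E, A, 3)
    # Far-field model: a plane wave arriving from unit direction u reaches a
    # microphone at position p earlier than the origin by (p . u) / c seconds.
    advances = directions @ mic_positions.T / SPEED_OF_SOUND  # (E, A, M)

    # Undo each microphone's time advance with a compensating phase shift.
    phase = np.exp(-2j * np.pi * advances[..., None] * freqs)  # (E, A, M, F)
    beam = (phase * spectra).sum(axis=2) / n_mics               # (E, A, F)

    # Sum the beamformed power over frequency to get one energy per direction.
    return (np.abs(beam) ** 2).sum(axis=-1)
```

In a sketch like this, the resulting `(E, A)` array is a small image that a convolutional network could take as input. A real pipeline would likely compute such maps per short-time frame and may normalise them, which is not shown here.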
Source: arXiv:2602.16399