NeuralMUSIC: A Hybrid Neural-Subspace Framework for Robot Sound Source Localization
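1️⃣ One-Sentence Summary
This paper proposes a hybrid framework that combines deep learning with the classical MUSIC algorithm: a neural network predicts the spatial covariance matrix, and a frequency attention mechanism fuses the per-frequency results, significantly improving the accuracy and cross-scenario generalization of robot sound source localization in noisy environments.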
Reliable sound source localization is fundamental to robot audition, enabling autonomous robots to perceive spatial cues and operate effectively in dynamic environments. Classical methods such as Multiple Signal Classification (MUSIC) offer strong theoretical foundations but degrade under low signal-to-noise ratios. While deep learning-based approaches achieve promising performance, they often struggle with limited generalization across conditions. To address these challenges, we propose NeuralMUSIC, a hybrid neural-subspace framework for robotic sound source localization. Specifically, a neural network first estimates the spatial covariance matrix from multichannel microphone observations. The predicted covariance is then integrated into a classical MUSIC pipeline with eigenvalue decomposition (EVD) and pseudo-spectrum computation, followed by a Frequency Attention Fusion (FAF) module to produce the final DOA estimates. To improve data efficiency, we further introduce a Self-supervised Spatial Correlation Learning (SSCL) strategy that leverages unlabeled acoustic data to capture spatial structure. Extensive experiments across different robotic tasks demonstrate that NeuralMUSIC achieves competitive localization accuracy while exhibiting improved robustness and cross-domain generalization.
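The classical half of the pipeline described above (covariance, EVD, pseudo-spectrum, fusion across frequency) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a uniform linear array, a single far-field source, and synthetic narrowband snapshots. The sample covariance stands in for the network-predicted covariance, and a softmax-weighted average with fixed scores stands in for the FAF module. All function names and parameters here are hypothetical.

```python
import numpy as np

def steering_vector(theta_deg, n_mics, spacing, freq, c=343.0):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    delays = np.arange(n_mics) * spacing * np.sin(np.deg2rad(theta_deg)) / c
    return np.exp(-2j * np.pi * freq * delays)

def music_spectrum(cov, n_sources, grid_deg, spacing, freq):
    """MUSIC pseudo-spectrum from one covariance matrix at one frequency."""
    n_mics = cov.shape[0]
    # eigh returns eigenvalues in ascending order, so the first
    # n_mics - n_sources eigenvectors span the noise subspace.
    _, vecs = np.linalg.eigh(cov)
    noise = vecs[:, : n_mics - n_sources]
    spectrum = np.empty(len(grid_deg))
    for i, theta in enumerate(grid_deg):
        a = steering_vector(theta, n_mics, spacing, freq)
        proj = noise.conj().T @ a
        # Small floor avoids division by zero at an exact null.
        spectrum[i] = 1.0 / max(np.real(proj.conj() @ proj), 1e-12)
    return spectrum

def fuse_over_frequency(spectra, scores):
    """Softmax-weighted average of per-frequency spectra.

    Placeholder for a learned frequency attention module; in a learned
    version the scores would come from a network, here they are given.
    """
    w = np.exp(scores - np.max(scores))
    w /= w.sum()
    return w @ spectra

def demo(true_deg=20.0, n_mics=6, spacing=0.04, n_snapshots=200, seed=0):
    rng = np.random.default_rng(seed)
    freqs = np.array([1000.0, 2000.0, 3000.0])
    grid = np.arange(-90.0, 91.0, 1.0)
    spectra = []
    for f in freqs:
        a = steering_vector(true_deg, n_mics, spacing, f)
        s = rng.standard_normal(n_snapshots) + 1j * rng.standard_normal(n_snapshots)
        n = 0.1 * (rng.standard_normal((n_mics, n_snapshots))
                   + 1j * rng.standard_normal((n_mics, n_snapshots)))
        x = np.outer(a, s) + n
        # Sample covariance; a hybrid method would substitute a predicted one.
        cov = x @ x.conj().T / n_snapshots
        p = music_spectrum(cov, 1, grid, spacing, f)
        spectra.append(p / p.max())  # normalize so no single bin dominates
    fused = fuse_over_frequency(np.array(spectra), np.zeros(len(freqs)))
    return grid[np.argmax(fused)]

estimate = demo()
print("estimated DOA (degrees):", estimate)
```

With zero scores the fusion reduces to a plain average over frequency bins. Non-uniform scores would let more reliable bins contribute more, which is the general idea behind attention-based fusion.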
Source: arXiv: 2606.18664