Efficient Sensor Fusion for Gesture Recognition on Resource-Constrained Devices
1️⃣ One-Sentence Summary
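This paper proposes a lightweight, privacy-preserving gesture recognition system that fuses low-resolution time-of-flight and infrared thermal sensor data. Running on a microcontroller, it reaches 92.3% gesture recognition accuracy with just over six thousand parameters and 50 mW of power, making it suitable for wearables such as smart glasses.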
Gesture recognition is a cornerstone of Human-Computer Interaction (HCI) for smart eyewear, enabling natural and device-free control in augmented reality environments. Traditional vision-based approaches face significant challenges regarding power consumption, computational latency, and user privacy. This paper proposes a lightweight, privacy-preserving gesture recognition system based on the fusion of low-resolution Time-of-Flight (ToF) and Infrared (IR) thermal sensors. We use an 8×8 multizone ToF sensor (VL53L8CH) and an 8×8 IR array (AMG8833) to capture complementary depth and thermal cues. A compact Convolutional Neural Network (CNN) with a specialized grouped-convolution architecture is designed to fuse these modalities efficiently on a microcontroller (MCU). Experimental results on a custom dataset of 7 static gestures, validated via k-fold cross-validation, demonstrate that the proposed fusion strategy significantly outperforms single-sensor baselines with an accuracy of 92.3% and a macro F1-score of 0.93. Finally, on-device benchmarks on STM32F4 and STM32H7 MCUs confirm the system's suitability for resource-constrained wearables, requiring only 6,343 parameters and achieving millisecond-level inference latency with a total system power of 50 mW.
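To give a feel for why grouped convolutions suit a two-sensor input, the sketch below counts the parameters of a single convolution layer with and without grouping, then estimates the weight storage for the reported model size. The layer shape (2 input channels, 16 output channels, 3×3 kernel) is a hypothetical illustration and not the architecture from the paper; only the 6,343-parameter total is taken from the abstract. The 4-byte (float32) and 1-byte (int8) widths are assumed storage formats.

```python
def conv2d_params(in_ch, out_ch, kernel, groups=1, bias=True):
    """Count trainable parameters in a 2-D convolution layer."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("in_ch and out_ch must be divisible by groups")
    # Each output filter only sees in_ch / groups input channels.
    weights = (in_ch // groups) * out_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


def weight_bytes(n_params, bytes_per_param):
    """Storage needed for the weights alone (no activations, no code)."""
    return n_params * bytes_per_param


# Hypothetical first layer: channel 0 = ToF depth, channel 1 = IR thermal.
standard = conv2d_params(2, 16, 3, groups=1)  # every filter mixes both sensors
grouped = conv2d_params(2, 16, 3, groups=2)   # 8 filters per sensor, no mixing

# Parameter total reported in the abstract.
REPORTED_PARAMS = 6343
float32_bytes = weight_bytes(REPORTED_PARAMS, 4)  # assumed float32 storage
int8_bytes = weight_bytes(REPORTED_PARAMS, 1)     # assumed int8 quantization

print(standard, grouped)          # 304 160
print(float32_bytes, int8_bytes)  # 25372 6343
```

In this illustrative layer, grouping halves the convolution weights (288 to 144, plus 16 biases in both cases) because each sensor channel is filtered separately. Under the assumed float32 format, 6,343 parameters occupy about 25 KB.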
Source: arXiv: 2605.13462