Abstract - EfficientSign: An Attention-Enhanced Lightweight Architecture for Indian Sign Language Recognition
How do you build a sign language recognizer that runs on a phone? That question drove this work. We built EfficientSign, a lightweight model that augments EfficientNet-B0 with two attention modules: Squeeze-and-Excitation for channel recalibration, and a spatial attention layer that concentrates on hand regions. We compared it against five other approaches on 12,637 images of Indian Sign Language alphabets covering all 26 classes, using 5-fold cross-validation. EfficientSign reaches 99.94% accuracy (+/-0.05%), matching ResNet18's 99.97% while using 62% fewer parameters (4.2M vs. 11.2M). We also experimented with feeding deep features (1,280-dimensional vectors extracted from EfficientNet-B0's global pooling layer) into classical classifiers: SVM reached 99.63% accuracy, Logistic Regression 99.03%, and KNN 96.33%. All of these comfortably surpass the 92% that SURF-based methods achieved on a similar dataset in 2015. Our results show that an attention-enhanced model offers an efficient, deployable solution for ISL recognition without requiring a massive network or hand-tuned feature pipelines.
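The two attention modules named above can be illustrated numerically. The following is a minimal NumPy sketch, not the paper's implementation: the reduction ratio, the 3x3 kernel over average/max pooled maps (a common CBAM-style choice), and the random weights are all placeholder assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation: reweight channels by global context.
    x: (C, H, W) feature map; w1: (C//r, C) and w2: (C, C//r) are the
    excitation FC weights (random here; learned in practice)."""
    z = x.mean(axis=(1, 2))                   # squeeze: global average pool -> (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0))   # excitation: FC -> ReLU -> FC -> sigmoid
    return x * s[:, None, None]               # scale each channel by its weight

def spatial_attention(x, kernel):
    """Spatial attention: weight each location using pooled channel statistics.
    x: (C, H, W); kernel: (2, k, k) conv weights over [avg, max] maps
    (a CBAM-style formulation, assumed here for illustration)."""
    stacked = np.stack([x.mean(axis=0), x.max(axis=0)])   # (2, H, W)
    k = kernel.shape[-1]
    p = k // 2
    padded = np.pad(stacked, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    att = np.zeros((H, W))
    for i in range(H):                        # naive sliding-window convolution
        for j in range(W):
            att[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(att)[None, :, :]       # scale each spatial location

# Toy dimensions, stand-ins for an EfficientNet-B0 feature map.
rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 5, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 3, 3)) * 0.1

y = spatial_attention(se_block(x, w1, w2), kernel)
print(y.shape)  # same shape as the input: (8, 5, 5)
```

Because both modules only rescale the feature map by sigmoid gates in (0, 1), they add very few parameters relative to the backbone, which is what keeps the overall model lightweight.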
1️⃣ One-Sentence Summary
This paper proposes EfficientSign, a lightweight mobile-oriented sign language recognition model that integrates two attention mechanisms to match the high accuracy of larger models (about 99.94%) while reducing the parameter count by 62%, offering a practical route to deploying efficient ISL recognition on resource-constrained devices such as phones.