arXiv submission date: 2026-05-20
📄 Abstract - ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing

Spiking neural networks (SNNs) exploit event-driven and addition-only computation to substantially improve efficiency for intelligent computation. A key temporal property of SNNs, elastic inference, allows outputs to emerge progressively, enabling responses to salient inputs much earlier than full evaluation. However, existing SNN-specific accelerators cannot capitalize on this property. Layer-by-layer designs emit outputs only after all layers are complete, while time-step-by-time-step designs rely on coarse-grained, layer-wise pipelines that require synchronizing all spines/tokens within a layer. This barrier prevents results from being forwarded immediately, delaying the earliest possible response and forfeiting the benefits of elastic inference. To address these challenges, we propose ELSA, a near-SRAM dataflow architecture that realizes true elastic inference through a fine-grained spine/token-wise pipeline and hardware optimizations tailored to SNNs. ELSA forwards each spine/token immediately upon production, forming a continuous streaming pipeline that substantially reduces the latency to the first response. To enhance this lightweight execution, ELSA introduces a bundled address event representation protocol to lower communication traffic of network-on-chip (NoC), and leverages mini-batch spiking Gustavson-product to cut memory access and exploit inherent sparsity. Combined with mapping and scheduling optimizations, ELSA achieves efficient, event-driven computation without compromising accuracy. Experiments show that SNNs can outperform quantized artificial neural networks (QANNs) while maintaining on-par accuracy. For a 4-bit ResNet-50, ELSA achieves 3.4$\times$ speedup and 13.6$\times$ higher energy efficiency over the SOTA QANN accelerator (ANT), and 2.9$\times$ speedup and 22.1$\times$ energy efficiency gains over the SOTA SNN accelerator (PAICORE).
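The abstract names a "mini-batch spiking Gustavson-product" without describing it. As a rough illustration only, the sketch below applies Gustavson's row-wise matrix product to binary spike vectors: each active input spike selects one weight row, which is accumulated into the output using additions alone. Treating the "mini-batch" as a group of time steps that share a single fetch of each weight row is an assumption here, and the function name `spiking_row_product` is hypothetical, not taken from the paper.

```python
def spiking_row_product(spikes, weights):
    """Row-wise (Gustavson-style) product of binary spikes with a weight matrix.

    spikes:  list of T binary vectors, each of length n_in (one per time step;
             grouping time steps into a mini-batch is an assumption of this sketch)
    weights: n_in x n_out matrix given as a list of rows
    returns: list of T output vectors, each of length n_out
    """
    n_in = len(weights)
    n_out = len(weights[0])
    out = [[0] * n_out for _ in spikes]
    for i in range(n_in):
        # Time steps in the mini-batch whose input i carries a spike.
        active = [t for t, s in enumerate(spikes) if s[i]]
        if not active:
            continue  # sparsity: row i is never fetched
        row = weights[i]  # fetched once, reused by every active time step
        for t in active:
            acc = out[t]
            for j in range(n_out):
                acc[j] += row[j]  # addition only, no multiplication
    return out


spikes = [[1, 0, 1], [0, 0, 1]]
weights = [[1, 2], [3, 4], [5, 6]]
result = spiking_row_product(spikes, weights)
print(result)  # [[6, 8], [5, 6]]
```

In this toy input, weight row 1 is never read because no time step spikes on input 1, and row 2 is read once but used by both time steps, which is the kind of memory-access saving the abstract alludes to.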

Top-level tags: systems, machine learning
Detailed tags: spiking neural networks, neuromorphic computing, hardware architecture, dataflow, inference acceleration

ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing


1️⃣ One-Sentence Summary

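This paper proposes ELSA, a new hardware accelerator architecture for spiking neural networks (SNNs) that uses a fine-grained spine/token-wise pipeline and SNN-specific hardware optimizations to forward each result as soon as it is produced, much like stream processing, which substantially shortens the time to first response and delivers higher speed and energy efficiency than state-of-the-art SNN and quantized neural network accelerators.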

Source: arXiv: 2605.20802