LaplacianFormer: Rethinking Linear Attention with Laplacian Kernel
1️⃣ One-sentence summary
This paper proposes LaplacianFormer, a new Transformer architecture that replaces the conventional softmax attention with a Laplacian kernel. It captures long- and mid-range pixel interactions more accurately while retaining linear computational complexity, and, paired with efficient numerical algorithms and GPU acceleration, achieves a better performance-efficiency balance on image recognition tasks.
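A quick numerical illustration (my own toy comparison, not from the paper, with both bandwidths fixed at 1) of why a Laplacian kernel retains more mid-range interaction than the Gaussian kernels used by prior linear attention variants:

```python
import numpy as np

# Kernel weight as a function of token distance d:
# Gaussian exp(-d^2) vs Laplacian exp(-d).
d = np.linspace(0.0, 5.0, 11)
gaussian = np.exp(-d**2)   # decays very fast past d ~ 1
laplacian = np.exp(-d)     # heavier tail at mid range

# At d = 2 the Gaussian weight is exp(-4) ~ 0.018, while the
# Laplacian weight is exp(-2) ~ 0.135: the Gaussian suppresses
# mid-range interactions roughly 7x more strongly.
```

The heavier tail is what lets Laplacian-kernel attention keep meaningful weight on mid-range pixel pairs instead of concentrating almost entirely on near neighbors.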
The quadratic complexity of softmax attention presents a major obstacle for scaling Transformers to high-resolution vision tasks. Existing linear attention variants often replace the softmax with Gaussian kernels to reduce complexity, but such approximations lack theoretical grounding and tend to oversuppress mid-range token interactions. We propose LaplacianFormer, a Transformer variant that employs a Laplacian kernel as a principled alternative to softmax, motivated by empirical observations and theoretical analysis. To address expressiveness degradation under low-rank approximations, we introduce a provably injective feature map that retains fine-grained token information. For efficient computation, we adopt a Nyström approximation of the kernel matrix and solve the resulting system using Newton-Schulz iteration, avoiding costly matrix inversion and SVD. We further develop custom CUDA implementations for both the kernel and solver, enabling high-throughput forward and backward passes suitable for edge deployment. Experiments on ImageNet show that LaplacianFormer achieves strong performance-efficiency trade-offs while improving attention expressiveness.
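The abstract's efficiency recipe, Nyström approximation of the kernel matrix plus Newton-Schulz iteration in place of explicit inversion or SVD, can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: segment-mean landmarks and the `nystrom_laplacian_attention` / `newton_schulz_inverse` names are my own illustration choices, not the paper's exact scheme, and the real model runs custom CUDA kernels rather than NumPy.

```python
import numpy as np

def newton_schulz_inverse(A, iters=25):
    """Approximate A's (pseudo-)inverse via Newton-Schulz: X <- X (2I - A X).
    The init X0 = A^T / (||A||_1 * ||A||_inf) guarantees convergence,
    avoiding explicit matrix inversion and SVD."""
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    I = np.eye(A.shape[0])
    for _ in range(iters):
        X = X @ (2.0 * I - A @ X)
    return X

def laplacian_kernel(X, Y, sigma=1.0):
    """Pairwise Laplacian kernel exp(-||x - y||_1 / sigma)."""
    d = np.abs(X[:, None, :] - Y[None, :, :]).sum(-1)
    return np.exp(-d / sigma)

def nystrom_laplacian_attention(Q, K, V, m=8, sigma=1.0):
    """Nystrom-approximated Laplacian-kernel attention: O(n*m) memory/compute
    instead of the O(n^2) full kernel matrix. Landmarks are segment means,
    one common choice (an assumption here)."""
    n = Q.shape[0]
    segs = np.array_split(np.arange(n), m)
    Lq = np.stack([Q[s].mean(0) for s in segs])   # m query landmarks
    Lk = np.stack([K[s].mean(0) for s in segs])   # m key landmarks
    K1 = laplacian_kernel(Q, Lk, sigma)                              # n x m
    K2inv = newton_schulz_inverse(laplacian_kernel(Lq, Lk, sigma))   # m x m
    K3 = laplacian_kernel(Lq, K, sigma)                              # m x n
    num = K1 @ (K2inv @ (K3 @ V))                    # n x d_v, never forms n x n
    den = K1 @ (K2inv @ (K3 @ np.ones((n, 1))))      # row mass for normalization
    return num / np.clip(den, 1e-6, None)
```

Grouping the products as `K1 @ (K2inv @ (K3 @ V))` is what keeps the cost linear in sequence length: the full n-by-n attention matrix is never materialized.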
Source: arXiv: 2604.20368