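Gated Differential Linear Attention: A Linear-Time Decoder for High-Fidelity Medical Segmentation
1️⃣ One-sentence summary
This paper proposes PVT-GDLA, a new medical image segmentation model whose gated differential linear attention mechanism significantly improves segmentation accuracy and boundary sharpness while keeping linear computational complexity, offering an efficient and accurate option for clinical deployment.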
Medical image segmentation requires models that preserve fine anatomical boundaries while remaining efficient for clinical deployment. While transformers capture long-range dependencies, they suffer from quadratic attention cost and large data requirements, whereas CNNs are compute-friendly yet struggle with global reasoning. Linear attention offers $\mathcal{O}(N)$ scaling, but often exhibits training instability and attention dilution, yielding diffuse maps. We introduce PVT-GDLA, a decoder-centric Transformer that restores sharp, long-range dependencies at linear time. Its core, Gated Differential Linear Attention (GDLA), computes two kernelized attention paths on complementary query/key subspaces and subtracts them with a learnable, channel-wise scale to cancel common-mode noise and amplify relevant context. A lightweight, head-specific gate injects nonlinearity and input-adaptive sparsity, mitigating attention sink, and a parallel local token-mixing branch with depthwise convolution strengthens neighboring-token interactions, improving boundary fidelity, all while retaining $\mathcal{O}(N)$ complexity and low parameter overhead. Coupled with a pretrained Pyramid Vision Transformer (PVT) encoder, PVT-GDLA achieves state-of-the-art accuracy across CT, MRI, ultrasound, and dermoscopy benchmarks under equal training budgets, with comparable parameters but lower FLOPs than CNN-, Transformer-, hybrid-, and linear-attention baselines. PVT-GDLA provides a practical path to fast, scalable, high-fidelity medical segmentation in clinical environments and other resource-constrained settings.
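The GDLA mechanism described in the abstract can be illustrated with a minimal single-head NumPy sketch. This is not the authors' implementation: the feature map (`elu_plus_one`), the even split of query/key channels into two subspaces, the sigmoid gate, and all names (`linear_attention`, `gdla_head`, `Wq`, `Wk`, `Wv`, `Wg`, `lam`) are assumptions chosen for illustration, and the parallel depthwise-convolution branch is omitted.

```python
import numpy as np

def elu_plus_one(x):
    # Positive feature map commonly used for kernelized attention
    # (an assumption; the abstract does not name the kernel).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v, eps=1e-6):
    # q, k: (N, d); v: (N, dv). Cost is O(N * d * dv), linear in N,
    # because the (d, dv) summary k^T v is built once and reused.
    phi_q, phi_k = elu_plus_one(q), elu_plus_one(k)
    kv = phi_k.T @ v                 # (d, dv)
    z = phi_k.sum(axis=0)            # (d,)
    num = phi_q @ kv                 # (N, dv)
    den = phi_q @ z + eps            # (N,)
    return num / den[:, None]

def gdla_head(x, Wq, Wk, Wv, Wg, lam):
    # x: (N, c) tokens. Hypothetical projection matrices Wq, Wk, Wv, Wg.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    half = q.shape[1] // 2
    # Two attention paths on complementary query/key subspaces.
    a1 = linear_attention(q[:, :half], k[:, :half], v)
    a2 = linear_attention(q[:, half:], k[:, half:], v)
    # Differential step: subtract with a channel-wise scale lam of shape (dv,).
    diff = a1 - lam * a2
    # Input-dependent sigmoid gate (assumed form of the lightweight gate).
    gate = 1.0 / (1.0 + np.exp(-(x @ Wg)))
    return gate * diff

rng = np.random.default_rng(0)
N, c, d, dv = 16, 8, 8, 4
x = rng.standard_normal((N, c))
Wq, Wk = rng.standard_normal((c, d)), rng.standard_normal((c, d))
Wv, Wg = rng.standard_normal((c, dv)), rng.standard_normal((c, dv))
lam = np.full(dv, 0.5)               # would be learnable in a real model
out = gdla_head(x, Wq, Wk, Wv, Wg, lam)
```

In this sketch the subtraction removes whatever the two paths attend to in common, and the gate can push individual output channels toward zero per token. In a trained model `lam` and the projections would be learned parameters rather than random values.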
Source: arXiv: 2603.02727