arXiv submission date: 2026-03-03
📄 Abstract - Gated Differential Linear Attention: A Linear-Time Decoder for High-Fidelity Medical Segmentation

Medical image segmentation requires models that preserve fine anatomical boundaries while remaining efficient for clinical deployment. While transformers capture long-range dependencies, they suffer from quadratic attention cost and large data requirements, whereas CNNs are compute-friendly yet struggle with global reasoning. Linear attention offers $\mathcal{O}(N)$ scaling, but often exhibits training instability and attention dilution, yielding diffuse maps. We introduce PVT-GDLA, a decoder-centric Transformer that restores sharp, long-range dependencies at linear time. Its core, Gated Differential Linear Attention (GDLA), computes two kernelized attention paths on complementary query/key subspaces and subtracts them with a learnable, channel-wise scale to cancel common-mode noise and amplify relevant context. A lightweight, head-specific gate injects nonlinearity and input-adaptive sparsity, mitigating attention sink, and a parallel local token-mixing branch with depthwise convolution strengthens neighboring-token interactions, improving boundary fidelity, all while retaining $\mathcal{O}(N)$ complexity and low parameter overhead. Coupled with a pretrained Pyramid Vision Transformer (PVT) encoder, PVT-GDLA achieves state-of-the-art accuracy across CT, MRI, ultrasound, and dermoscopy benchmarks under equal training budgets, with comparable parameters but lower FLOPs than CNN-, Transformer-, hybrid-, and linear-attention baselines. PVT-GDLA provides a practical path to fast, scalable, high-fidelity medical segmentation in clinical environments and other resource-constrained settings.
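
The mechanism described in the abstract can be illustrated with a minimal single-head NumPy sketch. This is not the authors' implementation. The abstract does not specify the kernel, the subspace split, the gate form, or how the local branch is combined, so the following are all assumptions made for illustration: the `elu(x) + 1` feature map, splitting the query/key channels into two halves, a sigmoid gate computed from the input, a 1D depthwise convolution over the token axis applied to the values, and every name (`gdla_sketch`, `init_params`, `lam`, `Wg`, `w_dw`).

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a common positive feature map for linear attention (assumed here).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v, eps=1e-6):
    # O(N) in the token count: summarize keys and values once, then read out per query.
    fq, fk = feature_map(q), feature_map(k)
    kv = fk.T @ v                    # (d, dv) key-value summary
    z = fq @ fk.sum(axis=0)          # (N,) per-query normalizer
    return (fq @ kv) / (z[:, None] + eps)

def depthwise_conv1d(x, w):
    # x: (N, D), w: (K, D) with odd K; zero "same" padding.
    # Each channel is filtered independently along the token axis.
    pad = w.shape[0] // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return sum(xp[i:i + x.shape[0]] * w[i] for i in range(w.shape[0]))

def init_params(dim, rng):
    # Hypothetical parameter container; dim must be even for the half split below.
    s = 1.0 / np.sqrt(dim)
    names = ("Wq", "Wk", "Wv", "Wg", "Wo")
    p = {n: rng.normal(scale=s, size=(dim, dim)) for n in names}
    p["lam"] = np.full(dim, 0.5)                       # channel-wise subtraction scale
    p["w_dw"] = rng.normal(scale=0.1, size=(3, dim))   # depthwise kernel, K = 3
    return p

def gdla_sketch(x, p):
    q, k, v = x @ p["Wq"], x @ p["Wk"], x @ p["Wv"]
    h = q.shape[1] // 2
    # Two kernelized attention paths on disjoint query/key channel halves (assumed split).
    a1 = linear_attention(q[:, :h], k[:, :h], v)
    a2 = linear_attention(q[:, h:], k[:, h:], v)
    # Differential step: subtract the second path with a channel-wise scale.
    diff = a1 - p["lam"] * a2
    # Input-dependent sigmoid gate (assumed form of the gating).
    gate = 1.0 / (1.0 + np.exp(-(x @ p["Wg"])))
    # Parallel local token-mixing branch (assumed to act on the values).
    local = depthwise_conv1d(v, p["w_dw"])
    return (gate * diff + local) @ p["Wo"]

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))         # 16 tokens, 8 channels
y = gdla_sketch(x, init_params(8, rng))
print(y.shape)
```

Every step touches each token a constant number of times, so the cost grows linearly with the number of tokens `N`; the `(d, dv)` key-value summary replaces the `N x N` attention matrix of standard attention.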

Top-level tags: medical, computer vision, model training
Detailed tags: medical image segmentation, linear attention, transformer, efficient architecture, boundary preservation

Gated Differential Linear Attention: A Linear-Time Decoder for High-Fidelity Medical Segmentation


1️⃣ One-sentence summary

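This paper proposes PVT-GDLA, a new medical image segmentation model. Its gated differential linear attention mechanism keeps computational complexity linear while improving segmentation accuracy and boundary sharpness, offering an efficient and accurate option for clinical deployment.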

Source: arXiv: 2603.02727