Abstract - SP-MoMamba: Superpixel-driven Mixture of State Space Experts for Efficient Image Super-Resolution
State space models (SSMs) have emerged as a powerful paradigm for efficient single-image super-resolution (SR) due to their linear complexity and long-range modeling capabilities. However, existing Mamba-based methods typically rely on data-agnostic rigid scanning, which reshapes 2D images into 1D sequences over a fixed grid, inevitably disrupting spatial-semantic topology and introducing artifacts. Inspired by Gestalt perceptual grouping theory, we propose SP-MoMamba, a superpixel-driven mixture of state space experts designed for content-aware SR. Our core idea is to transform the traditional rigid scanning into a semantic-level interaction by treating superpixels as fundamental units. Specifically, we introduce the Superpixel-driven State Space Model (SP-SSM), which compresses semantically homogeneous regions into high-order tokens to preserve global topological consistency. To address the conflict between fixed scanning scales and diverse semantic granularities, we develop the Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE). This module utilizes a dynamic routing mechanism to adaptively assign scale-specific experts, effectively capturing multi-scale textures while reducing computational redundancy. Furthermore, to prevent the loss of high-frequency details during global abstraction, we introduce a Local Spatial Modulation Expert (LSME) to complement the global modeling, ensuring a precise reconstruction of sharp edges and fine structures. Extensive experiments on standard benchmarks demonstrate that SP-MoMamba achieves superior reconstruction fidelity and a more favorable efficiency-performance trade-off compared to state-of-the-art efficient SR methods.
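To make the idea of compressing homogeneous regions into tokens concrete, here is a minimal sketch in plain Python. It is not the paper's SP-SSM: the abstract does not say how superpixels are computed or how tokens are formed, so this assumes a precomputed superpixel label map (for example from an algorithm such as SLIC) and uses simple mean pooling. The function names `pool_superpixel_tokens` and `scatter_tokens` are hypothetical.

```python
from collections import defaultdict


def pool_superpixel_tokens(features, labels):
    """Average per-pixel feature vectors within each superpixel.

    features: H x W nested list of feature vectors (lists of floats).
    labels:   H x W nested list of integer superpixel ids (assumed precomputed).
    Returns a dict {superpixel_id: mean feature vector}.
    Mean pooling is an assumption made for illustration only.
    """
    sums = {}
    counts = defaultdict(int)
    for feat_row, label_row in zip(features, labels):
        for feat, sp in zip(feat_row, label_row):
            if sp not in sums:
                sums[sp] = [0.0] * len(feat)
            for c, value in enumerate(feat):
                sums[sp][c] += value
            counts[sp] += 1
    return {sp: [s / counts[sp] for s in vec] for sp, vec in sums.items()}


def scatter_tokens(tokens, labels):
    """Broadcast each superpixel token back to the pixels it covers."""
    return [[list(tokens[sp]) for sp in row] for row in labels]


# Toy 2x4 single-channel feature map covered by two superpixels (ids 0 and 1).
features = [[[1.0], [3.0], [10.0], [10.0]],
            [[2.0], [2.0], [10.0], [14.0]]]
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]

tokens = pool_superpixel_tokens(features, labels)   # 8 pixels -> 2 tokens
restored = scatter_tokens(tokens, labels)           # back to a 2x4 map
```

In this toy case the sequence a state space model would have to scan shrinks from eight pixels to two tokens. A sequence model would operate on `tokens` before the scatter step, and because the scatter step discards within-region variation, a separate local branch is needed to recover fine detail, which is the role the abstract assigns to LSME.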
SP-MoMamba: Superpixel-driven Mixture of State Space Experts for Efficient Image Super-Resolution
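1️⃣ One-sentence summary
This paper proposes SP-MoMamba, an image super-resolution method that combines superpixels with state space models. Instead of scanning over a traditional fixed grid, it partitions the image into semantically consistent regions (superpixels) and introduces a multi-scale expert routing mechanism, which improves image sharpness while substantially reducing computation, yielding more efficient and more realistic image upscaling.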