arXiv submission date: 2026-03-11
📄 Abstract - Frames2Residual: Spatiotemporal Decoupling for Self-Supervised Video Denoising

Self-supervised video denoising methods typically extend image-based frameworks into the temporal dimension, yet they often struggle to integrate inter-frame temporal consistency with intra-frame spatial specificity. Existing Video Blind-Spot Networks (BSNs) enforce noise independence by masking the center pixel; this constraint prevents the use of spatial evidence for texture recovery, thereby severing spatiotemporal correlations and causing texture loss. To address this, we propose Frames2Residual (F2R), a spatiotemporal decoupling framework that explicitly divides self-supervised training into two distinct stages: blind temporal consistency modeling and non-blind spatial texture recovery. In Stage 1, a blind temporal estimator learns inter-frame consistency using a frame-wise blind strategy, producing a temporally consistent anchor. In Stage 2, a non-blind spatial refiner leverages this anchor to safely reintroduce the center frame and recover intra-frame high-frequency spatial residuals while preserving temporal stability. Extensive experiments demonstrate that our decoupling strategy allows F2R to outperform existing self-supervised methods on both sRGB and raw video benchmarks.
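The two-stage idea in the abstract can be sketched in code. The following is a minimal toy illustration, not the paper's implementation: the learned blind temporal estimator is stood in for by a simple average over neighboring frames (which by construction never sees the center frame's noise), and the learned non-blind refiner is stood in for by a residual blend that reintroduces the noisy center frame on top of the anchor. All function names and the `alpha` parameter are our own assumptions for illustration.

```python
import numpy as np

def blind_temporal_anchor(frames: np.ndarray, t: int) -> np.ndarray:
    """Stage 1 (blind): estimate frame t using ONLY its temporal neighbors.

    frames: (T, H, W) noisy video. Because frame t itself is excluded
    (the frame-wise blind strategy), the estimate cannot copy frame t's
    noise, yielding a temporally consistent but spatially smoothed anchor.
    Here a plain neighbor average stands in for the learned estimator.
    """
    neighbors = [frames[i] for i in range(frames.shape[0]) if i != t]
    return np.mean(neighbors, axis=0)

def nonblind_refine(anchor: np.ndarray, noisy_center: np.ndarray,
                    alpha: float = 0.2) -> np.ndarray:
    """Stage 2 (non-blind): refine the anchor with the center frame.

    With the anchor fixed, the noisy center frame can be reintroduced
    safely; the refiner predicts a high-frequency spatial residual on
    top of the anchor. A toy blend weighted by alpha (a hypothetical
    knob) stands in for the learned spatial refiner.
    """
    residual = alpha * (noisy_center - anchor)  # recover fine detail
    return anchor + residual

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.ones((8, 8))
    frames = np.stack([clean + 0.5 * rng.standard_normal((8, 8))
                       for _ in range(9)])
    anchor = blind_temporal_anchor(frames, t=4)
    denoised = nonblind_refine(anchor, frames[4])
    # Averaging neighbors suppresses independent noise, so the anchor
    # should be closer to the clean frame than the noisy center is.
    print(np.mean((anchor - clean) ** 2) < np.mean((frames[4] - clean) ** 2))
```

In the paper the two stages are learned networks trained self-supervised; the point of the sketch is only the data flow: the center frame is excluded in Stage 1 and reintroduced as a residual source in Stage 2.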

Top-level tags: computer vision, model training, video
Detailed tags: video denoising, self-supervised learning, spatiotemporal modeling, blind-spot networks, residual learning

Frames2Residual: Spatiotemporal Decoupling for Self-Supervised Video Denoising


1️⃣ One-sentence summary

This paper proposes a method called Frames2Residual that decomposes video denoising into two independent stages, blind temporal modeling and non-blind spatial refinement. This resolves the difficulty existing self-supervised methods face in balancing inter-frame consistency against the recovery of fine intra-frame detail, achieving higher-quality video denoising without requiring clean training data.

Source: arXiv 2603.10417