TextOVSR: Text-Guided Real-World Opera Video Super-Resolution
1️⃣ One-Sentence Summary
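This paper proposes TextOVSR, a new method that guides the model with two kinds of text prompts, one describing the degradation of the frames and one describing their content. It effectively addresses the blurry image quality that old opera videos suffer from because of equipment limitations and long-term storage, so that texture details can be restored more realistically and finely.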
Many classic opera videos exhibit poor visual quality due to the limitations of early filming equipment and long-term degradation during storage. Although real-world video super-resolution (RWVSR) has achieved significant advances in recent years, directly applying existing methods to degraded opera videos remains challenging. The difficulties are twofold. First, accurately modeling real-world degradations is complex: simplistic combinations of classical degradation kernels fail to capture the authentic noise distribution, while methods that extract real noise patches from external datasets are prone to style mismatches that introduce visual artifacts. Second, current RWVSR methods, which rely solely on degraded image features, struggle to reconstruct realistic and detailed textures due to a lack of high-level semantic guidance. To address these issues, we propose a Text-guided Dual-Branch Opera Video Super-Resolution (TextOVSR) network, which introduces two types of textual prompts to guide the super-resolution process. Specifically, degradation-descriptive text, derived from the degradation process, is incorporated into a negative branch to constrain the solution space. Simultaneously, content-descriptive text is incorporated into a positive branch and our proposed Text-Enhanced Discriminator (TED) to provide semantic guidance for enhanced texture reconstruction. Furthermore, we design a Degradation-Robust Feature Fusion (DRF) module to facilitate cross-modal feature fusion while suppressing degradation interference. Experiments on our OperaLQ benchmark show that TextOVSR outperforms state-of-the-art methods both qualitatively and quantitatively. The code is available at this https URL.
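The abstract says the degradation-descriptive text is derived from the degradation process but does not give the procedure. The sketch below is one possible reading, not the authors' implementation: it samples the parameters of a synthetic degradation and turns them into a coarse text prompt. The `Degradation` fields, the sampling ranges, the thresholds, and the wording of the prompt are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Degradation:
    """Hypothetical parameters of one synthetic degradation pass."""
    blur_sigma: float   # Gaussian blur strength
    noise_std: float    # additive noise level on a 0-255 scale
    jpeg_quality: int   # compression quality, 1-100
    scale: int          # downsampling factor


def sample_degradation(rng: random.Random) -> Degradation:
    """Draw one random degradation; the ranges are illustrative only."""
    return Degradation(
        blur_sigma=rng.uniform(0.2, 3.0),
        noise_std=rng.uniform(1.0, 30.0),
        jpeg_quality=rng.randint(30, 95),
        scale=4,
    )


def level(value: float, low: float, high: float) -> str:
    """Map a numeric value to a coarse word using two thresholds."""
    if value < low:
        return "slight"
    if value < high:
        return "moderate"
    return "heavy"


def describe(d: Degradation) -> str:
    """Turn sampled parameters into a degradation-descriptive prompt."""
    parts = [
        f"{level(d.blur_sigma, 1.0, 2.0)} blur",
        f"{level(d.noise_std, 10.0, 20.0)} noise",
        # Lower JPEG quality means stronger compression, so invert it.
        f"{level(100 - d.jpeg_quality, 20, 50)} compression artifacts",
        f"{d.scale}x downsampling",
    ]
    return ", ".join(parts)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sampled prompt is repeatable
    print(describe(sample_degradation(rng)))
```

In a training pipeline of this kind, the same sampled parameters would both degrade the high-quality clip and produce the prompt, so the text paired with each low-quality clip would describe its degradation without manual labeling. How TextOVSR encodes the prompt and feeds it to the negative branch is not covered by this sketch.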
Source: arXiv: 2603.15153