TEXTS-Diff: TEXTS-Aware Diffusion Model for Real-World Text Image Super-Resolution
1️⃣ One-Sentence Summary
This paper proposes TEXTS-Diff, a text-aware diffusion model, and constructs Real-Texts, a large-scale real-world text image dataset, with the goal of restoring both the image background and blurred, distorted text at high quality, significantly improving overall visual quality and text legibility in text image super-resolution for complex scenes.
Real-world text image super-resolution aims to restore overall visual quality and text legibility in images suffering from diverse degradations and text distortions. However, the scarcity of text image data in existing datasets results in poor performance on text regions, and datasets composed of isolated text samples limit the quality of background reconstruction. To address these limitations, we construct Real-Texts, a large-scale, high-quality dataset collected from real-world images, which covers diverse scenarios and contains natural text instances in both Chinese and English. We further propose the TEXTS-Aware Diffusion Model (TEXTS-Diff) to achieve high-quality generation in both background and textual regions. The model leverages abstract concepts to improve the understanding of textual elements within visual scenes, and concrete text regions to enhance textual detail. It mitigates the distortions and hallucination artifacts commonly observed in text regions while preserving high visual-scene fidelity. Extensive experiments demonstrate that our method achieves state-of-the-art performance across multiple evaluation metrics, exhibiting superior generalization and text restoration accuracy in complex scenarios. The code, models, and dataset will be released.
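To make the dual-conditioning idea in the abstract concrete, here is a minimal PyTorch sketch of how an "abstract" caption-level embedding and a "concrete" text-region signal might be injected into a diffusion backbone's feature maps. Since the paper's code is not yet released, every name here (`DualTextConditioner`, the tensor shapes, the mask-based region encoder) is an illustrative assumption, not the authors' implementation.

```python
# A minimal sketch of dual text conditioning for a diffusion SR backbone.
# Assumption: the "abstract" signal is a pooled caption embedding and the
# "concrete" signal is a soft mask over detected text regions.
import torch
import torch.nn as nn


class DualTextConditioner(nn.Module):
    """Fuses a caption-level (abstract) embedding with text-region
    (concrete) features and injects them residually into UNet features."""

    def __init__(self, dim=320, prompt_dim=768):
        super().__init__()
        # Project the caption embedding to the feature dimension.
        self.prompt_proj = nn.Linear(prompt_dim, dim)
        # Encode the binary/soft text-region mask into spatial features.
        self.region_enc = nn.Sequential(
            nn.Conv2d(1, dim // 4, 3, padding=1), nn.SiLU(),
            nn.Conv2d(dim // 4, dim, 3, padding=1),
        )
        self.fuse = nn.Conv2d(dim * 2, dim, 1)

    def forward(self, feat, prompt_emb, text_mask):
        # feat:       (B, dim, H, W) UNet feature map
        # prompt_emb: (B, prompt_dim) pooled caption embedding
        # text_mask:  (B, 1, H, W) soft mask of detected text regions
        global_cond = self.prompt_proj(prompt_emb)[:, :, None, None]
        local_cond = self.region_enc(text_mask)
        cond = self.fuse(torch.cat([feat + global_cond, local_cond], dim=1))
        # Residual injection leaves non-text background features intact.
        return feat + cond


if __name__ == "__main__":
    m = DualTextConditioner()
    feat = torch.randn(2, 320, 32, 32)
    prompt = torch.randn(2, 768)
    mask = torch.rand(2, 1, 32, 32)
    print(m(feat, prompt, mask).shape)  # torch.Size([2, 320, 32, 32])
```

The residual design reflects the stated goal of enhancing text regions without disturbing background fidelity; how TEXTS-Diff actually combines the two signals will only be clear once the code is released.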
Source: arXiv:2601.17340