Abstract - DocQT: Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables
Document manipulation localization models achieve strong performance on public benchmarks yet fail to generalize to operational document workflows. We identify a critical and overlooked source of this gap: the mismatch between the narrow distribution of JPEG quantization tables used during training (restricted to standard libjpeg quality factors) and the heterogeneous compression profiles encountered in real-world insurance document pipelines. To isolate this factor, we conduct a controlled factorial study comparing two architectures with contrasting levels of quantization table awareness, FFDN [2] and Mesorch [20], each trained under either standard quality factor augmentation (Standard-QT) or operationally calibrated quantization tables sampled from DocQT, a quantization-table bank derived from a MAIF operational image corpus (Real-QT), and evaluated under three recompression conditions. Training under Real-QT yields substantial localization gains on DocTamper [15] and significantly reduces the pixel-level false positive rate on authentic operational documents, but only for architectures that explicitly ingest the quantization table as input. The released DocQT quantization-table dataset and compression-reproduction material are directly available at this https URL. These results demonstrate that standard quality factor augmentation does not adequately proxy operational compression diversity, and that architectural choices explicitly conditioning on the quantization table provide a meaningful robustness advantage for real-world deployment.
DocQT: Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables
1️⃣ One-Sentence Summary
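This paper finds that the JPEG compression parameters used during training are too uniform, which causes document forgery localization models to perform poorly in real-world settings; training on a diverse set of quantization tables collected from actual document processing pipelines, combined with a network architecture that takes the quantization table directly as input, significantly improves localization accuracy for forged regions and lowers the false positive rate on authentic documents.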
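To make the contrast between the two training regimes named in the abstract concrete, here is a minimal sketch, not the authors' code. It assumes the Standard-QT regime derives each table from the JPEG Annex K luminance table using libjpeg's quality-factor scaling rule, and that the Real-QT regime draws a table from a bank of observed tables. The names `standard_qt` and `sample_qt` and the layout of `bank` (a list of 64-integer tables) are hypothetical; the actual DocQT file format is not described here.

```python
import random

# JPEG Annex K example luminance table, natural (row-major) order.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def libjpeg_scale(quality: int) -> int:
    """Map a 1..100 quality factor to libjpeg's percentage scale factor."""
    quality = max(1, min(100, quality))
    return 5000 // quality if quality < 50 else 200 - 2 * quality


def standard_qt(quality: int, base=BASE_LUMA) -> list:
    """Scale the base table as libjpeg does, clamped to the baseline range 1..255."""
    scale = libjpeg_scale(quality)
    return [max(1, min(255, (v * scale + 50) // 100)) for v in base]


def sample_qt(rng: random.Random, bank=None, quality_range=(50, 100)) -> list:
    """Pick one luminance table for a training sample.

    With a bank (hypothetical stand-in for DocQT), draw an observed table;
    otherwise fall back to a table derived from a random quality factor.
    """
    if bank:
        return list(rng.choice(bank))
    return standard_qt(rng.randint(*quality_range))
```

Every table produced by the fallback path is a rescaling of one base table, so it covers a one-parameter family. A bank of observed tables is not limited in that way, which is the distribution gap the abstract points to. To recompress an image with a sampled table, Pillow's JPEG writer accepts a `qtables` argument to `Image.save`; check its documentation for the expected table ordering before relying on it.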