IAD-Unify: A Region-Grounded Unified Model for Industrial Anomaly Segmentation, Understanding, and Generation
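1️⃣ One-sentence summary
This paper proposes IAD-Unify, a unified model that simultaneously localizes and segments industrial defects, explains their causes in natural language, and generates realistic defect images from instructions, and it validates the model's effectiveness on a large-scale dataset.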
Real-world industrial inspection requires not only localizing defects, but also explaining them in natural language and generating controlled defect edits. However, existing approaches fail to jointly support all three capabilities within a unified framework and evaluation protocol. We propose IAD-Unify, a dual-encoder unified framework in which a frozen DINOv2-based region expert supplies precise anomaly evidence to a shared Qwen3.5-4B vision-language backbone via lightweight token injection, jointly enabling anomaly segmentation, region-grounded understanding, and mask-guided generation. To enable unified evaluation, we further construct Anomaly-56K, a unified multi-task industrial anomaly detection (IAD) evaluation platform spanning 59,916 images across 24 categories and 104 defect variants. Controlled ablations yield four findings: (i) region grounding is the decisive mechanism for understanding, since removing it degrades location accuracy by more than 76 percentage points (pp); (ii) performance with predicted regions closely matches performance with oracle regions, confirming deployment viability; (iii) region-grounded generation achieves the best full-image fidelity and masked-region perceptual quality; and (iv) pre-initialized joint training improves understanding at negligible generation cost (-0.16 dB). IAD-Unify further achieves strong performance on the MMAD benchmark, including on categories unseen during training, demonstrating robust cross-category generalization.
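The abstract does not spell out how the "lightweight token injection" works, so the following is only a minimal sketch of one plausible reading: patch features from a region expert are mean-pooled inside a predicted anomaly mask, linearly projected to the backbone's embedding width, and inserted into the token sequence as an extra token. Every name here (`masked_mean_pool`, `project`, `inject_region_token`), the pooling choice, and the toy dimensions are assumptions for illustration, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def masked_mean_pool(patch_feats: List[Vector], mask: List[int]) -> Vector:
    """Average the patch features whose mask entry is non-zero.

    Hypothetical stand-in for summarising region-expert features
    inside a predicted anomaly mask.
    """
    selected = [feat for feat, m in zip(patch_feats, mask) if m]
    if not selected:
        raise ValueError("mask selects no patches")
    dim = len(selected[0])
    return [sum(feat[i] for feat in selected) / len(selected) for i in range(dim)]


def project(vec: Vector, weight: List[Vector]) -> Vector:
    """Apply a linear map; `weight` has shape (out_dim, in_dim)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def inject_region_token(
    tokens: List[Vector], region_token: Vector, position: int = 0
) -> List[Vector]:
    """Return a new sequence with `region_token` inserted at `position`."""
    if tokens and len(region_token) != len(tokens[0]):
        raise ValueError("region token width must match the token width")
    return tokens[:position] + [region_token] + tokens[position:]


# Toy example: 3 patches with 2-dim features, backbone width 3 (assumed sizes).
patch_feats = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
mask = [0, 1, 1]  # patches 1 and 2 are flagged as anomalous
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical projection
tokens = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]  # placeholder backbone tokens

pooled = masked_mean_pool(patch_feats, mask)
region_token = project(pooled, weight)
augmented = inject_region_token(tokens, region_token)
```

In this reading the region expert stays frozen and only the small projection would be trained, which is one way the injection could remain "lightweight"; the paper itself should be consulted for the actual design.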
Source: arXiv: 2604.12440