arXiv submission date: 2026-05-28
📄 Abstract - AnomalyAgent: Training-Free Agentic Models for Zero-/Few-Shot Anomaly Detection

Benefiting from the generalizability of vision-language models (VLMs) such as CLIP, many zero-/few-shot anomaly detection (AD) approaches have achieved impressive detection performance across various datasets. Nevertheless, they require substantial training on large auxiliary datasets to adapt VLMs to anomaly detection, and their inference relies largely on anomaly scores based on visual-text embedding similarity, so they lack the reasoning abilities needed to detect complex anomalies that require in-depth contextual understanding. To address these limitations, we propose AnomalyAgent, a novel training-free, agentic framework that leverages the advanced reasoning and generalization capabilities of multimodal large language models (MLLMs) for anomaly detection. The key ingredients include 1) a comprehensive anomaly-centric toolset that enables adaptive MLLM-driven, agentic anomaly reasoning in zero-shot settings, and 2) a customized memory module that grounds anomaly reasoning with few-shot, in-context reference examples. We extend evaluation beyond the detection of simple anomalies in widely used benchmarks (e.g., surface defects such as cracks and dents, and clear lesions) to more diverse types of anomalies, such as logical/contextual anomalies in logistics and manufacturing settings. Extensive experimental results demonstrate that AnomalyAgent achieves substantially better performance than training-free VLM-based AD and generic agentic methods, highlighting its superior generalization capability in both zero-shot and few-shot anomaly detection settings. The code implementation can be found at this address.
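For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of an anomaly score based on visual-text embedding similarity. It is not AnomalyAgent's method and not taken from the paper: the function names, the two-prompt setup ("normal" vs. "anomalous"), the temperature value, and the toy 2-D vectors are all assumptions for illustration. In practice the embeddings would come from a VLM such as CLIP.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_anomaly_score(image_emb, normal_text_emb, anomalous_text_emb,
                             temperature=0.07):
    """Hypothetical helper: softmax over the image's similarity to a
    'normal' prompt embedding and an 'anomalous' prompt embedding.
    Returns the probability mass assigned to the anomalous prompt."""
    s_normal = cosine(image_emb, normal_text_emb) / temperature
    s_anom = cosine(image_emb, anomalous_text_emb) / temperature
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(s_normal, s_anom)
    e_normal = math.exp(s_normal - m)
    e_anom = math.exp(s_anom - m)
    return e_anom / (e_normal + e_anom)


# Toy embeddings (assumed placeholders, not real VLM outputs).
normal_prompt = [1.0, 0.0]
anomalous_prompt = [0.0, 1.0]
score_normal_image = similarity_anomaly_score([1.0, 0.0], normal_prompt, anomalous_prompt)
score_odd_image = similarity_anomaly_score([0.0, 1.0], normal_prompt, anomalous_prompt)
```

A score of this kind only measures how close an image lies to each prompt in embedding space, which is why the abstract argues it falls short on logical or contextual anomalies that call for multi-step reasoning.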

Top-level tags: multi-modal machine learning
Detailed tags: anomaly detection, zero-shot learning, few-shot learning, vision-language models, agentic framework

AnomalyAgent: Training-Free Agentic Models for Zero-/Few-Shot Anomaly Detection


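1️⃣ One-sentence summary

This paper proposes AnomalyAgent, a new agentic framework that uses the reasoning ability of multimodal large language models to detect both simple and complex anomalies (such as logical or contextual anomalies) without additional training, and that generalizes better than traditional methods in zero-shot and few-shot settings.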

Source: arXiv: 2605.30140