YOLO26: A Comprehensive Architecture Overview and Key Improvements
1️⃣ One-Sentence Summary
This paper presents the first in-depth analysis of the core architecture of YOLO26, the latest object detection model in the series, and shows that innovations such as the removal of the DFL loss and the introduction of NMS-free inference substantially improve runtime speed on CPUs and other edge devices, with the aim of consolidating the YOLO family's leading position in computer vision.
You Only Look Once (YOLO) has been a prominent deep learning model for computer vision for a decade. This study explores the novel aspects of YOLO26, the most recent version in the YOLO series. Its primary enhancements are the elimination of Distribution Focal Loss (DFL), end-to-end NMS-free inference, the ProgLoss and Small-Target-Aware Label Assignment (STAL) strategies, and the MuSGD optimizer, which together are claimed to deliver up to a 43% speedup in CPU inference. These changes are intended to let YOLO26 achieve real-time performance on edge devices and other hardware without GPUs. Additionally, YOLO26 offers improvements across many computer vision tasks, including instance segmentation, pose estimation, and oriented bounding box (OBB) decoding. We aim for this work to go beyond consolidating information already available in the existing technical documentation. To that end, we performed a rigorous architectural investigation of YOLO26, relying primarily on the source code in its GitHub repository and on its official documentation. The detailed operational mechanisms of YOLO26 reside in its source code, which is rarely examined in this depth. We present a YOLO26 architectural diagram as the outcome of this investigation. To our knowledge, this study is the first to present the CNN-based architecture at the core of YOLO26. Our objective is to provide a precise architectural understanding of YOLO26 for researchers and developers who aim to further improve the YOLO model and keep it a leading deep learning model in computer vision.
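Since the removal of DFL is named as a headline change, a minimal sketch of what that removal means at the box-decoding level may help. This is an illustrative sketch only, assuming the DFL formulation used by earlier YOLO heads (a softmax over `reg_max` discrete bins per box side, decoded as the distribution's expectation); it is not the actual YOLO26 implementation, and the names `decode_with_dfl`, `decode_without_dfl`, and `reg_max` are hypothetical.

```python
# Hypothetical sketch contrasting DFL-style box decoding with direct regression.
import torch

reg_max = 16  # number of discrete bins per box side in a DFL-style head (assumed)


def decode_with_dfl(side_logits: torch.Tensor) -> torch.Tensor:
    """DFL decoding: each box side is the expectation of a softmax
    distribution over `reg_max` integer bins.
    side_logits: (num_anchors, 4, reg_max) -> (num_anchors, 4) distances.
    """
    probs = side_logits.softmax(dim=-1)
    bins = torch.arange(reg_max, dtype=probs.dtype)
    return (probs * bins).sum(dim=-1)


def decode_without_dfl(side_pred: torch.Tensor) -> torch.Tensor:
    """Direct regression (no DFL): the head emits one scalar per box side,
    so decoding reduces to a non-negativity constraint.
    side_pred: (num_anchors, 4) -> (num_anchors, 4) distances.
    """
    return side_pred.relu()


# Toy usage with random predictions for 8400 anchor points
logits = torch.randn(8400, 4, reg_max)
raw = torch.randn(8400, 4)
print(decode_with_dfl(logits).shape, decode_without_dfl(raw).shape)
```

The sketch illustrates why dropping DFL lightens the detection head: the per-side distribution (and its softmax plus expectation) disappears, leaving a plain regression output, which is one plausible source of the CPU-side speedup the paper describes.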
Source: arXiv: 2602.14582