Yolo-Key-6D: Single Stage Monocular 6D Pose Estimation with Keypoint Enhancements
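1️⃣ One-sentence summary
This paper proposes Yolo-Key-6D, a fast and accurate single-stage algorithm that estimates an object's 3D position and orientation from a single color image with the help of an added keypoint detection task; it reaches high accuracy while running in real time, offering a practical solution for applications such as robotics and augmented reality.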
Estimating the 6D pose of objects from a single RGB image is a critical task for robotics and extended reality applications. However, state-of-the-art multi-stage methods often suffer from high latency, making them unsuitable for real-time use. In this paper, we present Yolo-Key-6D, a novel single-stage, end-to-end framework for monocular 6D pose estimation designed for both speed and accuracy. Our approach enhances a YOLO-based architecture by integrating an auxiliary head that regresses the 2D projections of an object's 3D bounding box corners. This keypoint detection task significantly improves the network's understanding of 3D geometry. For stable end-to-end training, we directly regress rotation using a continuous 9D representation projected to SO(3) via singular value decomposition. On the LINEMOD and LINEMOD-Occluded benchmarks, Yolo-Key-6D achieves competitive accuracy scores of 96.24% and 69.41%, respectively, under the ADD(-S) 0.1d metric, while running in real time. Our results demonstrate that a carefully designed single-stage method can provide a practical and effective balance of performance and efficiency for real-world deployment.
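The abstract mentions regressing rotation as a continuous 9D vector that is projected to SO(3) via singular value decomposition. The following is a minimal NumPy sketch of that general technique, not the authors' implementation: the function name `project_to_so3` and the sample vector `raw` are hypothetical, and the row-major reshape of the 9 values into a 3x3 matrix is an assumption.

```python
import numpy as np


def project_to_so3(raw9):
    """Map an unconstrained 9-vector to the closest rotation matrix.

    "Closest" is meant in the Frobenius-norm sense. The row-major
    reshape is an assumption made for this sketch.
    """
    m = np.asarray(raw9, dtype=float).reshape(3, 3)
    u, _, vt = np.linalg.svd(m)
    # u @ vt is orthogonal but may be a reflection (det = -1).
    # Flipping the sign of the last singular direction forces det = +1.
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt


# Hypothetical raw network output: 9 unconstrained real numbers.
raw = np.array([0.9, -0.1, 0.05, 0.2, 1.1, -0.3, 0.0, 0.25, 0.8])
R = project_to_so3(raw)

# R should be orthonormal with determinant +1.
print(np.round(R @ R.T, 6))
print(np.linalg.det(R))
```

Because the projection is defined for any 3x3 input, the network output needs no normalization or orthogonality constraint during training; the determinant correction only matters when the raw matrix lies closer to a reflection than to a rotation.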
Source: arXiv: 2603.03879