Orient Anything V2: Unifying Orientation and Rotation Understanding
1️⃣ One-Sentence Summary
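This paper proposes Orient Anything V2, an enhanced foundation model that jointly understands the 3D orientation and rotation of objects from single or paired images. Through four key technical improvements, it achieves state-of-the-art zero-shot performance across a range of vision tasks.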
This work presents Orient Anything V2, an enhanced foundation model for unified understanding of object 3D orientation and rotation from single or paired images. Building upon Orient Anything V1, which defines orientation via a single unique front face, V2 extends this capability to handle objects with diverse rotational symmetries and directly estimate relative rotations. These improvements are enabled by four key innovations: 1) Scalable 3D assets synthesized by generative models, ensuring broad category coverage and balanced data distribution; 2) An efficient, model-in-the-loop annotation system that robustly identifies 0 to N valid front faces for each object; 3) A symmetry-aware, periodic distribution fitting objective that captures all plausible front-facing orientations, effectively modeling object rotational symmetry; 4) A multi-frame architecture that directly predicts relative object rotations. Extensive experiments show that Orient Anything V2 achieves state-of-the-art zero-shot performance on orientation estimation, 6DoF pose estimation, and object symmetry recognition across 11 widely used benchmarks. The model demonstrates strong generalization, significantly broadening the applicability of orientation estimation in diverse downstream tasks.
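To make the idea of a symmetry-aware, periodic target more concrete, the following is a minimal sketch and not the paper's actual objective. It assumes a single azimuth angle discretized into 360 one-degree bins, that an object with N valid front faces has those fronts equally spaced around the circle, and that each front is modeled as a Gaussian-shaped bump of width `sigma_deg`. The function names (`periodic_target`, `circular_distance`, `cross_entropy`), the bin count, the bump shape, and the choice of a uniform target when there is no valid front are all illustrative assumptions.

```python
import math


def circular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def periodic_target(front_deg, n_fronts, num_bins=360, sigma_deg=5.0):
    """Build a discrete target distribution over azimuth bins.

    Assumption: the n_fronts valid front faces are equally spaced, so the
    target has one peak every 360 / n_fronts degrees starting at front_deg.
    Assumption: an object with no valid front (n_fronts == 0) gets a
    uniform target, since no direction is preferred.
    """
    if n_fronts == 0:
        return [1.0 / num_bins] * num_bins

    step = 360.0 / n_fronts
    peaks = [(front_deg + k * step) % 360.0 for k in range(n_fronts)]
    bin_width = 360.0 / num_bins

    weights = []
    for i in range(num_bins):
        center = i * bin_width
        # Sum one Gaussian-shaped bump per valid front, using circular distance
        # so that the bumps wrap correctly around 0 / 360 degrees.
        w = sum(
            math.exp(-0.5 * (circular_distance(center, p) / sigma_deg) ** 2)
            for p in peaks
        )
        weights.append(w)

    total = sum(weights)
    return [w / total for w in weights]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target and a predicted distribution over bins."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))
```

Under these assumptions, an object with two-fold symmetry and a front at 30 degrees yields a target with equal peaks at 30 and 210 degrees, so a prediction that places its mass on either front is penalized equally. The paper's real parameterization and loss may differ.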
Source: arXiv: 2601.05573