arXiv submission date: 2025-12-22
📄 Abstract - Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface

Recent progress in robot learning has been driven by large-scale datasets and powerful visuomotor policy architectures, yet policy robustness remains limited by the substantial cost of collecting diverse demonstrations, particularly for spatial generalization in manipulation tasks. To reduce repetitive data collection, we present Real2Edit2Real, a framework that generates new demonstrations by bridging 3D editability with 2D visual data through a 3D control interface. Our approach first reconstructs scene geometry from multi-view RGB observations with a metric-scale 3D reconstruction model. Based on the reconstructed geometry, we perform depth-reliable 3D editing on point clouds to generate new manipulation trajectories while geometrically correcting the robot poses to recover physically consistent depth, which serves as a reliable condition for synthesizing new demonstrations. Finally, we propose a multi-conditional video generation model guided by depth as the primary control signal, together with action, edge, and ray maps, to synthesize spatially augmented multi-view manipulation videos. Experiments on four real-world manipulation tasks demonstrate that policies trained on data generated from only 1-5 source demonstrations can match or outperform those trained on 50 real-world demonstrations, improving data efficiency by up to 10-50x. Moreover, experimental results on height and texture editing demonstrate the framework's flexibility and extensibility, indicating its potential to serve as a unified data generation framework.
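The depth-editing step described in the abstract can be illustrated with a toy sketch: lift a depth map to a point cloud, move the points belonging to one object, and re-render depth with a z-buffer. This is not the paper's implementation. The pinhole intrinsics, the depth-threshold object mask, the pure-translation edit, and all function names below are assumptions chosen for illustration. The robot-pose correction and the video generation model are not covered.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
DepthMap = List[List[float]]


def unproject(depth: DepthMap, fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift a depth map (metres, 0 = no reading) to camera-frame 3D points."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def translate_masked(points: Sequence[Point], mask: Sequence[bool],
                     offset: Point) -> List[Point]:
    """Shift only the points flagged in mask; leave the rest untouched."""
    dx, dy, dz = offset
    return [(x + dx, y + dy, z + dz) if m else (x, y, z)
            for (x, y, z), m in zip(points, mask)]


def project(points: Sequence[Point], height: int, width: int,
            fx: float, fy: float, cx: float, cy: float) -> DepthMap:
    """Render points back to a depth map; the nearest point wins each pixel."""
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:
            continue
        u = round(fx * x / z + cx)
        v = round(fy * y / z + cy)
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth


# Toy scene (hypothetical values): flat background at 2 m, one "object" pixel at 1 m.
FX = FY = 1.0
CX = CY = 0.0
depth = [[2.0] * 6 for _ in range(4)]
depth[1][1] = 1.0

points = unproject(depth, FX, FY, CX, CY)
mask = [z < 1.5 for (_, _, z) in points]            # assumed object mask: anything nearer than 1.5 m
moved = translate_masked(points, mask, (2.0, 0.0, 0.0))  # slide the object 2 m along +x
edited = project(moved, 4, 6, FX, FY, CX, CY)
```

In this toy scene the object pixel moves from column 1 to column 3 of row 1 and occludes the background there, while its old location becomes a hole (depth 0) because nothing behind the object was ever observed. Regions like that are ones a downstream generator would have to fill in.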

Top-level tags: robotics, systems, model training
Detailed tags: demonstration generation, 3d editing, video synthesis, data efficiency, manipulation tasks

Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface


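1️⃣ One-sentence summary

This paper proposes Real2Edit2Real, a method that takes a small number of real robot demonstration videos and, through a 3D control interface, automatically generates large amounts of new and diverse training data, cutting the number of real demonstrations needed to train a policy by a factor of 10 to 50.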

Source: arXiv: 2512.19402