arXiv submission date: 2026-02-02
📄 Abstract - HumanX: Toward Agile and Generalizable Humanoid Interaction Skills from Human Videos

Enabling humanoid robots to perform agile and adaptive interactive tasks has long been a core challenge in robotics. Current approaches are bottlenecked by either the scarcity of realistic interaction data or the need for meticulous, task-specific reward engineering, which limits their scalability. To narrow this gap, we present HumanX, a full-stack framework that compiles human video into generalizable, real-world interaction skills for humanoids, without task-specific rewards. HumanX integrates two co-designed components: XGen, a data generation pipeline that synthesizes diverse and physically plausible robot interaction data from video while supporting scalable data augmentation; and XMimic, a unified imitation learning framework that learns generalizable interaction skills. Evaluated across five distinct domains--basketball, football, badminton, cargo pickup, and reactive fighting--HumanX successfully acquires 10 different skills and transfers them zero-shot to a physical Unitree G1 humanoid. The learned capabilities include complex maneuvers such as pump-fake turnaround fadeaway jumpshots without any external perception, as well as interactive tasks like sustained human-robot passing sequences over 10 consecutive cycles--learned from a single video demonstration. Our experiments show that HumanX achieves over 8 times higher generalization success than prior methods, demonstrating a scalable and task-agnostic pathway for learning versatile, real-world robot interactive skills.

Top-level tags: robotics, agents, model training
Detailed tags: humanoid robots, imitation learning, skill transfer, data generation, zero-shot transfer

HumanX: Toward Agile and Generalizable Humanoid Interaction Skills from Human Videos


1️⃣ One-Sentence Summary

This paper presents HumanX, a full-stack framework that learns humanoid robot interaction skills directly from human videos, without requiring complex task-specific reward functions. Across multiple locomotion and interaction tasks it achieves generalization far beyond prior methods, and successfully transfers the learned skills zero-shot to a real robot.

Source: arXiv 2602.02473