arXiv submission date: 2026-06-01
📄 Abstract - World-Task Factorization for Robot Learning

Robot learning must produce policies that generalize to new combinations of constraints, teammates, and environments. To achieve this, we must structurally factor the policy, which is a choice that dictates what generalizes, what requires retraining, and what remains entangled. Existing methods span a wide spectrum, from expecting structure to emerge from data scaling, to hand-designing it via hierarchies, skill libraries, or learned specializations. In this paper, we study what we argue is the most fundamental factorization in robotics: separating the world from the task. We investigate the conditions under which this factorization is principled. World factors are properties of the embodied system and the environment; they exist independently of intent. Task factors are defined by the task's logic over what the world admits. We formalize this asymmetry through Bayesian model evidence: it aligns with the data-generating process, maintains high likelihood through an analytical world model, and reduces the Occam's razor penalty on task parameters. We instantiate this factorization by pairing AICON, a differentiable graph of recursive estimators and interconnections that is compositional, operates without task-specific data, and propagates cost gradients to actuators, with a compact, learned policy that modulates gradient paths. Gradients serve as the interface between the two factors: they carry world structure through the graph and task structure through costs, enabling low-dimensional learning while preserving structural generalization. We test the world/task factorization across three problems that encompass heterogeneous robots, environments, task logic, and sensorimotor modalities. Our framework outperforms end-to-end baselines and analytical heuristics in all settings, generalizes zero-shot to out-of-distribution configurations, and transfers to real hardware without retraining.
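
For readers unfamiliar with the model-evidence argument, the standard Laplace-approximation decomposition of Bayesian evidence may help. It is a textbook identity, not a formula quoted from the paper, and the symbols ($D$ for data, $M$ for a model, $\theta$ for its $k$ learned parameters, $\hat{\theta}$ for the posterior mode) are assumptions made here for illustration.

```latex
p(D \mid M) \;=\; \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta
\;\approx\;
\underbrace{p(D \mid \hat{\theta}, M)}_{\text{best-fit likelihood}}
\;\times\;
\underbrace{p(\hat{\theta} \mid M)\,(2\pi)^{k/2}\,\det(A)^{-1/2}}_{\text{Occam factor}},
\qquad
A \;=\; -\nabla_{\theta}^{2} \log p(\theta \mid D, M)\,\Big|_{\theta = \hat{\theta}}
```

Read this way, the abstract's claim is that an analytical world model keeps the first factor high, while a compact task policy (small $k$) limits how much the second factor shrinks the evidence.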

Top-level tags: robotics, machine learning
Detailed tags: factorization, policy generalization, Bayesian model evidence, world model, gradient modulation

World-Task Factorization for Robot Learning


1️⃣ One-sentence summary

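This paper proposes a structured approach to robot learning that explicitly separates the "world model" (the properties of the environment and of the robot itself) from the "task logic" (the specific goal to be achieved), using a differentiable compositional graph (AICON) to carry world information and a small policy network to learn the task, so that the robot generalizes zero-shot across tasks, environments, and hardware without retraining.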
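
The idea of a policy that modulates gradient paths can be illustrated with a toy sketch. It is not the paper's AICON implementation: the point robot, the two costs, and the names `cost_gradients`, `policy_weights`, and `rollout` are all hypothetical, and the weights come from a hand-written rule standing in for a learned policy.

```python
import numpy as np

def cost_gradients(x, goal, obstacle):
    """Analytic gradients of two hypothetical costs w.r.t. position x (the 'world' side)."""
    # Goal cost 0.5 * ||x - goal||^2 has gradient (x - goal).
    g_goal = x - goal
    # Obstacle cost exp(-||x - obstacle||^2) has gradient -2 * d * exp(-||d||^2).
    d = x - obstacle
    g_obs = -2.0 * d * np.exp(-d @ d)
    return np.stack([g_goal, g_obs])

def policy_weights(x, goal, obstacle):
    """Stand-in for a learned policy: one non-negative weight per gradient path."""
    near = np.exp(-np.sum((x - obstacle) ** 2))
    return np.array([1.0, 5.0 * near])

def rollout(x0, goal, obstacle, n_steps=200, lr=0.1):
    """Descend the weighted sum of cost gradients; the action is a position update."""
    x = np.asarray(x0, dtype=float)
    goal = np.asarray(goal, dtype=float)
    obstacle = np.asarray(obstacle, dtype=float)
    for _ in range(n_steps):
        grads = cost_gradients(x, goal, obstacle)   # shape (2 costs, 2 dims)
        w = policy_weights(x, goal, obstacle)       # shape (2 costs,)
        x = x - lr * (w @ grads)
    return x

if __name__ == "__main__":
    print(rollout([0.0, 0.0], [4.0, 0.0], [2.0, 0.5]))
```

In this sketch the policy only outputs two scalars, which mirrors the low-dimensional learning problem the summary describes; the structure of the motion comes from the analytic gradients.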

Source: arXiv: 2606.02027