Physics-Informed Policy Optimization via Analytic Dynamics Regularization
1️⃣ One-Sentence Summary
This paper proposes a new method called PIPER that injects physical laws directly as constraints into robot reinforcement learning, making the learned policies more consistent with real-world physics and thereby significantly improving learning efficiency and control accuracy.
Reinforcement learning (RL) has achieved strong performance in robotic control; however, state-of-the-art policy learning methods, such as actor-critic methods, still suffer from high sample complexity and often produce physically inconsistent actions. This limitation stems from neural policies having to implicitly rediscover complex physics from data alone, even though accurate dynamics models are readily available in simulators. In this paper, we introduce PIPER, a physics-informed RL framework that integrates physical constraints directly into neural policy optimization as analytic soft constraints. At the core of our method is a differentiable Lagrangian residual used as a regularization term in the actor's objective. This residual, extracted from a robot's simulator description, biases policy updates towards dynamically consistent solutions. Crucially, this physics integration is realized through an additional loss term during policy optimization, requiring no alterations to existing simulators or core RL algorithms. Extensive experiments demonstrate that our method significantly improves learning efficiency, stability, and control accuracy, establishing a new paradigm for efficient and physically consistent robotic control.
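To make the mechanism concrete, here is a minimal PyTorch sketch of how such a regularizer could look, assuming the standard manipulator equation M(q)q̈ + C(q,q̇)q̇ + g(q) = τ. The names `M_fn`, `C_fn`, `g_fn`, and `lambda_phys` are illustrative assumptions, not the paper's actual interface:

```python
import torch

def lagrangian_residual(q, qd, qdd, tau, M_fn, C_fn, g_fn):
    """Mean squared residual of the manipulator equation
    M(q) q̈ + C(q, q̇) q̇ + g(q) = τ over a batch.

    M_fn, C_fn, g_fn are assumed to be differentiable callables returning
    the mass matrix, Coriolis matrix, and gravity vector derived from the
    robot's simulator description (e.g. its URDF).
    """
    r = (
        torch.einsum("bij,bj->bi", M_fn(q), qdd)       # inertial term M(q) q̈
        + torch.einsum("bij,bj->bi", C_fn(q, qd), qd)  # Coriolis term C(q, q̇) q̇
        + g_fn(q)                                      # gravity term g(q)
        - tau                                          # applied joint torques
    )
    return r.pow(2).sum(dim=-1).mean()

def regularized_actor_loss(rl_loss, q, qd, qdd, tau,
                           M_fn, C_fn, g_fn, lambda_phys=0.1):
    """Actor objective augmented with the physics penalty: rl_loss is the
    unmodified actor-critic policy loss, lambda_phys a tunable weight."""
    return rl_loss + lambda_phys * lagrangian_residual(
        q, qd, qdd, tau, M_fn, C_fn, g_fn)
```

Because the penalty is just an extra differentiable term on the actor's loss, it composes with any gradient-based RL algorithm without touching the simulator or the critic, which matches the paper's claim of requiring no changes to existing infrastructure.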
Source: arXiv:2603.14469