Koopman-based surrogate modeling for reinforcement learning control of Rayleigh-Bénard convection
1️⃣ One-sentence summary
This paper proposes using Linear Recurrent Autoencoder Networks (LRANs) as fast surrogate models to accelerate reinforcement learning training for the control of fluid systems. By combining policy-aware surrogate training with direct numerical simulation, it reduces training time by more than 40% while preserving control performance.
Training reinforcement learning (RL) agents to control fluid dynamics systems is computationally expensive due to the high cost of direct numerical simulations (DNS) of the governing equations. Surrogate models offer a promising alternative by approximating the dynamics at a fraction of the computational cost, but their feasibility as training environments for RL is limited by distribution shifts, as policies induce state distributions not covered by the surrogate training data. In this work, we investigate the use of Linear Recurrent Autoencoder Networks (LRANs) for accelerating RL-based control of 2D Rayleigh-Bénard convection. We evaluate two training strategies: a surrogate trained on precomputed data generated with random actions, and a policy-aware surrogate trained iteratively using data collected from an evolving policy. Our results show that while surrogate-only training leads to reduced control performance, combining surrogates with DNS in a pretraining scheme recovers state-of-the-art performance while reducing training time by more than 40%. We demonstrate that policy-aware training mitigates the effects of distribution shift, enabling more accurate predictions in policy-relevant regions of the state space.
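The core idea of a Koopman-style surrogate like the LRAN is that an encoder lifts the flow state into a latent space where the time evolution is approximately *linear*, so multi-step prediction reduces to repeated matrix multiplication; a decoder maps latent states back to the physical space. The sketch below (NumPy, with randomly initialized toy weights) illustrates only this rollout structure; the dimensions, names, and linear encoder/decoder are illustrative assumptions, not the paper's actual architecture, which would be trained on DNS snapshots.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 64   # flattened flow-field snapshot (toy size, not the paper's)
LATENT_DIM = 8   # latent (Koopman) dimension, illustrative

# Toy linear encoder/decoder; a real LRAN learns nonlinear autoencoder maps.
W_enc = rng.standard_normal((LATENT_DIM, STATE_DIM)) * 0.1
W_dec = rng.standard_normal((STATE_DIM, LATENT_DIM)) * 0.1

# Finite-dimensional approximation of the Koopman operator,
# rescaled to spectral radius 0.9 so rollouts stay bounded.
K = rng.standard_normal((LATENT_DIM, LATENT_DIM))
K *= 0.9 / max(abs(np.linalg.eigvals(K)))

def encode(x):
    return W_enc @ x

def decode(z):
    return W_dec @ z

def rollout(x0, n_steps):
    """Predict n_steps ahead: all time-stepping is linear in latent space."""
    z = encode(x0)
    preds = []
    for _ in range(n_steps):
        z = K @ z
        preds.append(decode(z))
    return np.stack(preds)

x0 = rng.standard_normal(STATE_DIM)
preds = rollout(x0, n_steps=10)
print(preds.shape)  # (10, 64)
```

Because each surrogate step is a single matrix-vector product rather than a DNS time step, an RL agent can collect rollouts at a fraction of the simulation cost, which is what makes the pretraining scheme described above pay off.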
Source: arXiv:2603.28074