Supervised Reinforcement Learning for the Coordination of Distributed Energy Resources
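1️⃣ One-sentence summary
Inspired by how large language models are trained, this paper proposes an "imitate first, optimize later" supervised reinforcement learning framework: the model first learns a basic coordination policy from existing data, then undergoes two stages of fine-tuning, offline and online, to adapt to the real environment, enabling efficient and reliable management of distributed energy resources with strong results even when the training data is of low quality.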
The increasing integration of distributed energy resources (DERs) is crucial for power system decarbonization, yet unlocking their flexibility is challenged by inherent uncertainties and modelling complexity. Because traditional optimization methods struggle with this uncertainty and complexity, reinforcement learning (RL) has emerged as a promising alternative for DER management. However, standard RL methods suffer from sample inefficiency and sub-optimality when trained from scratch. Inspired by the training paradigms of large language models, this paper proposes a Supervised Reinforcement Learning (SRL) framework for learning DER coordination policies. The framework first pre-trains a policy on demonstration data in a supervised-learning fashion and then fine-tunes it using RL. Fine-tuning proceeds in two steps: offline fine-tuning to enhance policy performance, followed by online fine-tuning to adapt the policy to real-world dynamics. Experiments demonstrate that RL implementations based on the proposed framework significantly outperform all benchmarks, achieving high cost efficiency even under low-quality demonstration data.
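The three-stage pipeline described in the abstract (supervised pre-training, offline fine-tuning, online fine-tuning) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not taken from the paper: the scalar state, the linear policy, the quadratic cost, the `target_gain` values that stand in for simulated versus real dynamics, and the use of random-search hill climbing in place of an actual RL algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(theta, states, target_gain):
    """Mean squared deviation from the hidden ideal action target_gain * s."""
    actions = theta[0] * states + theta[1]
    return float(np.mean((actions - target_gain * states) ** 2))

def pretrain(states, demo_actions):
    """Stage 1: supervised pre-training (behaviour cloning) via least squares."""
    X = np.column_stack([states, np.ones_like(states)])
    theta, *_ = np.linalg.lstsq(X, demo_actions, rcond=None)
    return theta

def fine_tune(theta, states, target_gain, steps=200, sigma=0.05):
    """Stages 2 and 3: improve the policy using only cost feedback.

    Random-search hill climbing is a stand-in for an RL algorithm here;
    a candidate is kept only if it lowers the cost.
    """
    best = cost(theta, states, target_gain)
    for _ in range(steps):
        candidate = theta + sigma * rng.normal(size=theta.shape)
        c = cost(candidate, states, target_gain)
        if c < best:
            theta, best = candidate, c
    return theta

# Hypothetical scalar states (e.g. a normalized price signal).
states = rng.uniform(-1.0, 1.0, size=256)
# Low-quality demonstrations: a biased gain plus noise.
demo_actions = 1.2 * states + 0.3 * rng.normal(size=states.shape)

theta_pre = pretrain(states, demo_actions)                  # supervised stage
theta_off = fine_tune(theta_pre, states, target_gain=1.8)   # offline: approximate model
theta_on = fine_tune(theta_off, states, target_gain=2.0)    # online: "real" dynamics

for name, theta in [("pre", theta_pre), ("offline", theta_off), ("online", theta_on)]:
    print(name, cost(theta, states, 2.0))
```

The point of the sketch is the ordering of the stages: the fine-tuning steps start from the cloned policy rather than from scratch, and the online step corrects for the mismatch between the model used offline and the dynamics seen at deployment.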
Source: arXiv: 2606.24947