Abstract - Characterizing MARL for Energy Control: A Multi-KPI Benchmark on the CityLearn Environment
The optimization of urban energy systems is crucial for the advancement of sustainable and resilient smart cities, which are becoming increasingly complex with multiple decision-making units. To address scalability and coordination concerns, Multi-Agent Reinforcement Learning (MARL) is a promising solution. This paper addresses the need for comprehensive and reliable benchmarking of MARL algorithms on energy management tasks. CityLearn is used as a case study environment because it realistically simulates urban energy systems, incorporates multiple storage systems, and utilizes renewable energy sources. On this environment, our work sets a new standard for evaluation by conducting a comparative study across multiple key performance indicators (KPIs). This approach illuminates the key strengths and weaknesses of various algorithms, moving beyond traditional KPI averaging, which often masks critical insights. Our experiments utilize widely accepted baselines such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), and encompass diverse training schemes, including Decentralized Training with Decentralized Execution (DTDE) and Centralized Training with Decentralized Execution (CTDE), as well as different neural network architectures. Our work also proposes novel KPIs that tackle real-world implementation challenges such as individual building contribution and battery storage lifetime. Our findings show that DTDE consistently outperforms CTDE in both average and worst-case performance. Additionally, temporal dependency learning improved control on memory-dependent KPIs such as ramping and battery usage, contributing to more sustainable battery operation. Results also reveal robustness to agent or resource removal, highlighting both the resilience and decentralizability of the learned policies.
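The abstract's point that KPI averaging can mask critical insights can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the helper names `ramping` and `summarize`, the definition of ramping as the sum of absolute step-to-step changes in net load, and the toy KPI values, which are invented and are not results from the paper. KPIs are assumed to be normalized against a baseline so that lower is better.

```python
from statistics import mean


def ramping(net_load):
    """Sum of absolute changes between consecutive time steps (assumed definition)."""
    return sum(abs(curr - prev) for prev, curr in zip(net_load, net_load[1:]))


def summarize(kpis):
    """Return (average, worst) over normalized KPIs, where lower is better."""
    values = list(kpis.values())
    return mean(values), max(values)


# Toy normalized KPIs, invented for illustration only (not taken from the paper).
controller_a = {"cost": 0.75, "peak": 0.75, "ramping": 0.75}
controller_b = {"cost": 0.5, "peak": 0.5, "ramping": 1.25}

avg_a, worst_a = summarize(controller_a)
avg_b, worst_b = summarize(controller_b)

# Both controllers share the same average, but B is worse than baseline on ramping.
print(f"A: average={avg_a}, worst={worst_a}")
print(f"B: average={avg_b}, worst={worst_b}")

# A flat load profile has zero ramping; an oscillating one accumulates every swing.
print(ramping([3.0, 3.0, 3.0, 3.0]), ramping([0.0, 6.0, 0.0, 6.0]))
```

In this toy case the two controllers are indistinguishable by their average score, while the per-KPI view shows that one of them degrades ramping relative to the baseline, which is the kind of difference a multi-KPI comparison is meant to surface.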
Characterizing MARL for Energy Control: A Multi-KPI Benchmark on the CityLearn Environment
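1️⃣ One-Sentence Summary
This study runs a systematic benchmark with multiple key performance indicators in CityLearn, a simulation environment for urban energy management. It shows that decentralized training with decentralized execution outperforms centralized training in both average and worst-case performance, and it proposes new evaluation metrics aimed at improving battery sustainability and system robustness.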