Welcome to the sole official website of the journal Electrical Measurement & Instrumentation (《电测与仪表》)

Article Abstract

On-line optimization of micro grid based on deep reinforcement learning

Received: October 14, 2020; Revised: October 23, 2020

DOI: 10.19753/j.issn1001-1390.2024.04.002

Keywords: microgrid dispatching, Q-learning, online optimization, Monte Carlo, deep reinforcement learning

Funding: Natural Science Foundation of Guangdong Province (2018A0303131001); National Natural Science Foundation of China (51977081)

Author Name	Affiliation	E-mail
Yu Honghui^*	School of Electric Power, South China University of Technology	hohuiyu@163.com
Lin Shenghong	School of Electric Power, South China University of Technology	linsh@scut.edu.cn
Zhu Jianquan	School of Electric Power, South China University of Technology	zhujianquan@scut.edu.cn
Chen Haowu	School of Electric Power, South China University of Technology	1250046203@qq.com

Abstract:

To address the stochastic optimal scheduling problem of microgrids, this paper proposes an online microgrid optimization algorithm based on deep reinforcement learning. A deep neural network approximates the state-action value function, with the discretized battery actions as the network outputs. For each battery action, nonlinear programming solves for the remaining decision variables and yields the immediate reward, and the optimal policy is then obtained with the Q-learning algorithm. To make the neural network robust to the randomness of wind, photovoltaic, and load power, Monte Carlo sampling generates multiple sets of training curves from the wind, photovoltaic, and load power forecast curves and their forecast errors. After training, the weights are saved; given the real-time state of the microgrid, the network outputs the battery action in real time, which enables online optimal dispatch of the microgrid. A comparison with day-ahead optimization results under fluctuations in wind, photovoltaic, and load power verifies the effectiveness and superiority of the proposed algorithm over day-ahead optimization for online microgrid optimization.
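
The abstract gives no implementation details, so the following is only a minimal sketch of three ingredients it names: Monte Carlo sampling of training curves around a forecast, discretization of the battery action, and the one-step Q-learning target. Every name here (`sample_scenarios`, `battery_actions`, `q_learning_target`), the Gaussian error model, the sign convention for battery power, and all numeric values are assumptions made for illustration and are not taken from the paper. The neural network and the nonlinear program that computes the immediate reward are left out.

```python
import numpy as np


def sample_scenarios(forecast, rel_std, n_scenarios, rng):
    """Draw Monte Carlo training curves around a forecast curve.

    Assumption (not from the paper): forecast errors are zero-mean Gaussian
    with standard deviation rel_std * forecast; negative power is clipped to 0.
    """
    forecast = np.asarray(forecast, dtype=float)
    noise = rng.normal(0.0, 1.0, size=(n_scenarios, forecast.size))
    return np.clip(forecast * (1.0 + rel_std * noise), 0.0, None)


def battery_actions(p_max, n_levels):
    """Discretize battery power into n_levels evenly spaced values.

    Assumed sign convention: negative = charging, positive = discharging.
    """
    return np.linspace(-p_max, p_max, n_levels)


def q_learning_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return float(reward)
    return float(reward) + gamma * float(np.max(next_q_values))


# Usage with made-up numbers (kW); none of these values come from the paper.
rng = np.random.default_rng(0)
load_forecast = [50.0, 60.0, 80.0, 70.0]
scenarios = sample_scenarios(load_forecast, rel_std=0.1, n_scenarios=100, rng=rng)
actions = battery_actions(p_max=20.0, n_levels=5)
target = q_learning_target(
    reward=-3.0, next_q_values=[-10.0, -8.0, -12.0], gamma=0.9, done=False
)
```

In a full training loop, each sampled curve would presumably serve as one training episode, and `target` would be the regression target for the network output that corresponds to the chosen battery action. How the paper actually structures its episodes and reward is not stated in the abstract.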
