TY - JOUR
T1 - Approximating Nash Equilibrium in Day-ahead Electricity Market Bidding with Multi-agent Deep Reinforcement Learning
AU - Du, Yan
AU - Li, Fangxing
AU - Zandi, Helia
AU - Xue, Yaosuo
N1 - Publisher Copyright:
© 2021 State Grid Electric Power Research Institute Nanjing Branch. All rights reserved.
PY - 2021/5/1
Y1 - 2021/5/1
AB - In this paper, a day-ahead electricity market bidding problem with multiple strategic generation company (GENCO) bidders is studied. The problem is formulated as a Markov game model, where GENCO bidders interact with each other to develop their optimal day-ahead bidding strategies. Considering the unobservable information in the problem, a model-free and data-driven approach, known as multi-agent deep deterministic policy gradient (MADDPG), is applied to approximate the Nash equilibrium (NE) of the above Markov game. The MADDPG algorithm has the advantage of generalization due to the automatic feature extraction ability of deep neural networks. The algorithm is tested on an IEEE 30-bus system with three competitive GENCO bidders in both an uncongested case and a congested case. Comparisons with a truthful bidding strategy and state-of-the-art deep reinforcement learning methods, including deep Q network (DQN) and deep deterministic policy gradient (DDPG), demonstrate that the applied MADDPG algorithm finds a superior bidding strategy for all market participants with increased profits. In addition, comparison with a conventional model-based method shows that the MADDPG algorithm has higher computational efficiency, making it feasible for real-world applications.
KW - Bidding strategy
KW - Markov game
KW - Nash equilibrium (NE)
KW - day-ahead electricity market
KW - deep reinforcement learning
KW - multi-agent deep deterministic policy gradient (MADDPG)
UR - http://www.scopus.com/inward/record.url?scp=85190303046&partnerID=8YFLogxK
U2 - 10.35833/MPCE.2020.000502
DO - 10.35833/MPCE.2020.000502
M3 - Article
AN - SCOPUS:85190303046
SN - 2196-5625
VL - 9
SP - 534
EP - 544
JO - Journal of Modern Power Systems and Clean Energy
JF - Journal of Modern Power Systems and Clean Energy
IS - 3
ER -