Abstract
This paper proposes model-based and model-free inverse reinforcement learning (RL) control algorithms for multiplayer game systems described by linear continuous-time differential equations. Both algorithms enable the learner to recover the same optimal control policies and trajectories as the expert by inferring the expert players' unknown cost functions from the expert's trajectories. The paper first presents a model-based inverse RL policy iteration that consists of 1) policy evaluation for cost matrices using a Lyapunov equation, 2) state-reward weight improvement using inverse optimal control (IOC), and 3) policy improvement using optimal control. Building on the model-based algorithm, an online data-driven inverse RL algorithm is then proposed that requires no knowledge of the system dynamics or the expert control gains. Rigorous convergence and stability analyses of both algorithms are provided. Finally, a simulation example verifies the approach.
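
As a rough illustration of the policy-evaluation and policy-improvement steps named in the abstract, the sketch below runs the classical single-player, scalar policy iteration for a system dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2. It is not the paper's multiplayer inverse RL algorithm. The IOC state-reward weight update (step 2) is left out because the abstract does not state its form, and all function names and numeric values here are hypothetical choices made for illustration.

```python
import math


def evaluate_policy(a, b, q, r, k):
    """Policy evaluation: solve the scalar Lyapunov equation
    2*(a - b*k)*p + q + r*k**2 = 0 for the cost weight p of u = -k*x."""
    closed_loop = a - b * k
    if closed_loop >= 0:
        raise ValueError("gain k must stabilize the system (a - b*k < 0)")
    return (q + r * k * k) / (-2.0 * closed_loop)


def improve_policy(b, r, p):
    """Policy improvement: scalar form of the gain update K = R^{-1} B^T P."""
    return b * p / r


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Alternate evaluation and improvement, starting from a stabilizing k0."""
    k = k0
    for _ in range(iterations):
        p = evaluate_policy(a, b, q, r, k)
        k = improve_policy(b, r, p)
    return p, k


def riccati_solution(a, b, q, r):
    """Closed-form positive root of 2*a*p - (b*p)**2 / r + q = 0,
    used only to check the iteration against the known optimum."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # Hypothetical example values; k0 = 2 stabilizes a - b*k = -1.
    p, k = policy_iteration(a=1.0, b=1.0, q=3.0, r=1.0, k0=2.0)
    print(p, k, riccati_solution(1.0, 1.0, 3.0, 1.0))
```

In this scalar setting the iterates approach the Riccati solution from a stabilizing initial gain. An inverse RL scheme of the kind the abstract describes would additionally adjust the state weight q between these two steps so that the learner's gain matches the expert's.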
Original language | English |
---|---|
Title of host publication | 2022 IEEE 61st Conference on Decision and Control, CDC 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 2839-2844 |
Number of pages | 6 |
ISBN (Electronic) | 9781665467612 |
State | Published - 2022 |
Event | 61st IEEE Conference on Decision and Control, CDC 2022 - Cancun, Mexico (Duration: Dec 6 2022 → Dec 9 2022) |
Publication series
Name | Proceedings of the IEEE Conference on Decision and Control |
---|---|
Volume | 2022-December |
ISSN (Print) | 0743-1546 |
ISSN (Electronic) | 2576-2370 |
Conference
Conference | 61st IEEE Conference on Decision and Control, CDC 2022 |
---|---|
Country/Territory | Mexico |
City | Cancun |
Period | 12/6/22 → 12/9/22 |
Funding
The research was supported by Office of Naval Research Grant N00014-18-1-2221 and Army Research Office Grant W911NF-20-1-0132.