Efficient Reward-Shaping for Multiagent Systems

Vrushabh S. Donge, Bosen Lian, Frank L. Lewis, Ali Davoudi

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

We address the reward-shaping problem of large-scale multiagent systems (MASs) using inverse reinforcement learning (IRL). The learning MAS has no prior knowledge of the target MAS's cost function and aims to reconstruct it from the target's demonstrations. We propose a scalable, model-free IRL algorithm for large-scale MASs in which dynamic mode decomposition (DMD) extracts the dominant dynamic modes and builds a projection matrix. This significantly reduces the data required while retaining the system's essential dynamic information. We prove the algorithm's convergence and stability, and the non-uniqueness of the state-reward weight. The efficacy of our method is validated on a large-scale consensus network by comparing the data sizes and computational time required for reward-shaping with and without DMD.
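
As a rough illustration of the dimensionality-reduction step described above, the sketch below (not the paper's implementation; the function name and truncation rank `r` are hypothetical) shows how exact DMD builds a rank-r projection matrix and reduced operator from state snapshots:

```python
import numpy as np

def dmd_projection(X, Xp, r):
    """Build a rank-r projection matrix from state snapshots via exact DMD.

    X, Xp : (n, m) snapshot matrices, where Xp is X advanced one time step.
    r     : number of dynamic modes to retain.
    Returns the reduced operator A_r (r, r) and projection matrix U_r (n, r).
    """
    # Truncated SVD of the snapshot matrix
    U, S, Vh = np.linalg.svd(X, full_matrices=False)
    U_r, S_r, V_r = U[:, :r], S[:r], Vh[:r, :].conj().T
    # Reduced-order linear operator capturing the dominant dynamics
    A_r = U_r.conj().T @ Xp @ V_r @ np.diag(1.0 / S_r)
    return A_r, U_r
```

Projecting each state as `z = U_r.conj().T @ x` then lets subsequent learning updates run in the r-dimensional mode space rather than the full state space, which is consistent with the data savings the abstract reports.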

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: IEEE Transactions on Control of Network Systems
State: Accepted/In press - 2024
Externally published: Yes

Keywords

  • Artificial neural networks
  • Control systems
  • Data-driven control
  • Dimensionality reduction
  • Dynamic mode decomposition
  • Heuristic algorithms
  • Inverse reinforcement learning
  • Large-scale system
  • Network systems
  • Optimal control
  • Stability criteria
