Efficient Reward Shaping for Multiagent Systems

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

In this article, we address the reward-shaping problem of large-scale multiagent systems (MASs) using inverse reinforcement learning (IRL). The learning MAS has no prior knowledge of the target MAS's cost function and aims to reconstruct it from the target's demonstrations. We propose a scalable model-free IRL algorithm for large-scale MASs in which dynamic mode decomposition (DMD) extracts the dominant dynamic modes and builds a projection matrix, significantly reducing the data required while retaining the system's essential dynamic information. Proofs of the algorithm's convergence and stability, and of the nonuniqueness of the state reward weight, are presented. The efficacy of our method is validated on a large-scale consensus network by comparing the data sizes and computational time required for reward shaping with and without DMD.
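The abstract's central mechanism, using DMD to build a projection matrix that compresses high-dimensional state data, can be illustrated with standard exact DMD. The sketch below is not the paper's algorithm; it only shows the generic projection step under assumed names (`dmd_projection`, a toy averaging-consensus system) chosen for illustration.

```python
import numpy as np

def dmd_projection(snapshots, r):
    """Return a rank-r projection matrix P (n x r) from state snapshots.

    snapshots: n x m matrix whose columns are successive states x_0..x_{m-1}.
    Illustrative only; the paper's construction may differ in detail.
    """
    X = snapshots[:, :-1]                 # states x_0 .. x_{m-2}
    # Truncated SVD of X yields the leading spatial (POD/DMD) modes.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    return U[:, :r]                       # columns span the dominant dynamics

# Toy example (assumed, not from the paper): a 100-agent linear
# averaging system, with trajectories reduced to r = 5 coordinates.
rng = np.random.default_rng(0)
n, m, r = 100, 60, 5
A = 0.95 * np.eye(n) + 0.05 * np.roll(np.eye(n), 1, axis=1)  # neighbor averaging
X = np.empty((n, m))
X[:, 0] = rng.standard_normal(n)
for k in range(m - 1):
    X[:, k + 1] = A @ X[:, k]

P = dmd_projection(X, r)
Z = P.T @ X                               # reduced (r x m) trajectory data
print(P.shape, Z.shape)                   # (100, 5) (5, 60)
```

Downstream learning (here, the IRL recursion) would then operate on the reduced data `Z` rather than the full `n`-dimensional trajectories, which is the source of the data and compute savings the abstract reports.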

Original language: English
Pages (from-to): 687-699
Number of pages: 13
Journal: IEEE Transactions on Control of Network Systems
Volume: 12
Issue number: 1
DOIs
State: Published - 2025
Externally published: Yes

Funding

This work was supported in part by the Army Research Office under Grant W911NF-20-1-0132.

Keywords

  • Data-driven control
  • dynamic mode decomposition (DMD)
  • inverse reinforcement learning (IRL)
  • large-scale system
  • optimal control
