On-policy Approximate Dynamic Programming for Optimal Control of non-linear systems

K. Shalini, D. Vrushabh, K. Sonam

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Optimal control theory deals with finding the policy that minimizes a discounted infinite-horizon quadratic cost function. Finding the optimal control policy requires solving the Hamilton-Jacobi-Bellman (HJB) equation, i.e., finding the value function that satisfies the Bellman equation. However, the HJB equation is a partial differential equation that is difficult to solve for nonlinear systems. This paper employs the approximate dynamic programming method to solve the HJB equation for deterministic nonlinear discrete-time systems with continuous state and action spaces. An approximate solution of the HJB equation is found by a policy iteration algorithm built on an actor-critic architecture. The control policy and value function are approximated by function approximators, namely neural networks expressed as linear combinations of linearly independent basis functions. A gradient descent optimization algorithm is employed to tune the weights of the actor and critic networks. The control algorithm is implemented on a cart-pole inverted pendulum system, and its effectiveness is demonstrated in simulations.
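The actor-critic policy-iteration scheme described in the abstract can be sketched as follows. This is a minimal illustrative implementation only: the scalar system dynamics, the polynomial basis functions, and the step sizes are assumptions made here for demonstration, not the paper's cart-pole setup.

```python
import numpy as np

# Toy scalar nonlinear discrete-time system (an assumption for illustration):
#   x_{k+1} = f(x_k) + g(x_k) * u_k
f = lambda x: 0.8 * np.sin(x)
g = lambda x: 1.0
gamma = 0.95          # discount factor
Q, R = 1.0, 0.1       # quadratic stage-cost weights: cost = Q*x^2 + R*u^2

# Linearly independent basis functions (hypothetical choice)
phi  = lambda x: np.array([x**2, x**4])    # critic basis: V(x) ~ w @ phi(x)
dphi = lambda x: np.array([2*x, 4*x**3])   # derivative of critic basis
psi  = lambda x: np.array([x, x**3])       # actor basis: u(x) ~ theta @ psi(x)

w = np.zeros(2)                 # critic weights
theta = np.array([-0.5, 0.0])   # actor weights (initial stabilizing guess)
alpha_c, alpha_a = 0.05, 0.01   # gradient-descent step sizes

xs = np.linspace(-1.0, 1.0, 21)  # sample states used for the updates

for _ in range(200):
    # Policy evaluation (critic): gradient descent on the Bellman residual
    for x in xs:
        u = theta @ psi(x)
        xn = f(x) + g(x) * u
        cost = Q * x**2 + R * u**2
        delta = cost + gamma * (w @ phi(xn)) - (w @ phi(x))  # Bellman residual
        w += alpha_c * delta * phi(x)  # semi-gradient step on the critic weights

    # Policy improvement (actor): descend the one-step cost-to-go w.r.t. theta
    for x in xs:
        u = theta @ psi(x)
        xn = f(x) + g(x) * u
        dV = w @ dphi(xn)                        # dV/dx at the successor state
        grad_u = 2 * R * u + gamma * dV * g(x)   # d/du [cost + gamma * V(x')]
        theta -= alpha_a * grad_u * psi(x)
```

The two inner loops mirror the two phases of policy iteration: the critic sweep drives the Bellman residual toward zero for the current policy, and the actor sweep nudges the policy weights in the direction that reduces the estimated cost-to-go.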

Original language: English
Title of host publication: 7th International Conference on Control, Decision and Information Technologies, CoDIT 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1058-1062
Number of pages: 5
ISBN (Electronic): 9781728159539
DOIs
State: Published - Jun 29 2020
Externally published: Yes
Event: 7th International Conference on Control, Decision and Information Technologies, CoDIT 2020 - Prague, Czech Republic
Duration: Jun 29 2020 - Jul 2 2020

Publication series

Name: 7th International Conference on Control, Decision and Information Technologies, CoDIT 2020

Conference

Conference: 7th International Conference on Control, Decision and Information Technologies, CoDIT 2020
Country/Territory: Czech Republic
City: Prague
Period: 06/29/20 - 07/02/20

Keywords

  • Approximate Dynamic Programming (ADP)
  • Gradient Descent
  • Hamilton-Jacobi-Bellman (HJB)
  • Optimal Control
