TY - GEN
T1 - Reinforcement learning based output-feedback control of nonlinear nonstrict feedback discrete-time systems with application to engines
AU - Shih, Peter
AU - Vance, J.
AU - Kaul, B.
AU - Jagannathan, S.
AU - Drallmeier, James A.
PY - 2007
Y1 - 2007
N2 - A novel reinforcement-learning-based output-adaptive neural network (NN) controller, also referred to as the adaptive-critic NN controller, is developed to track a desired trajectory for a class of complex nonlinear discrete-time systems in the presence of bounded and unknown disturbances. The controller includes an observer for estimating the states and outputs, a critic, and two action NNs for generating the virtual and actual control inputs. The critic approximates a certain strategic utility function, and the action NNs are used to minimize both the strategic utility function and their outputs. All NN weights adapt online towards minimization of a performance index, using a gradient-descent-based rule. A Lyapunov function is used to prove the uniform ultimate boundedness (UUB) of the closed-loop tracking error, weight estimates, and observer estimates. The separation and certainty-equivalence principles are relaxed, and neither the persistency-of-excitation condition nor the linear-in-the-unknown-parameters assumption is needed. The performance of this adaptive-critic NN controller is evaluated through simulation with the Daw engine model in lean mode. The objective is to reduce the cyclic dispersion in heat release by using the controller.
AB - A novel reinforcement-learning-based output-adaptive neural network (NN) controller, also referred to as the adaptive-critic NN controller, is developed to track a desired trajectory for a class of complex nonlinear discrete-time systems in the presence of bounded and unknown disturbances. The controller includes an observer for estimating the states and outputs, a critic, and two action NNs for generating the virtual and actual control inputs. The critic approximates a certain strategic utility function, and the action NNs are used to minimize both the strategic utility function and their outputs. All NN weights adapt online towards minimization of a performance index, using a gradient-descent-based rule. A Lyapunov function is used to prove the uniform ultimate boundedness (UUB) of the closed-loop tracking error, weight estimates, and observer estimates. The separation and certainty-equivalence principles are relaxed, and neither the persistency-of-excitation condition nor the linear-in-the-unknown-parameters assumption is needed. The performance of this adaptive-critic NN controller is evaluated through simulation with the Daw engine model in lean mode. The objective is to reduce the cyclic dispersion in heat release by using the controller.
UR - http://www.scopus.com/inward/record.url?scp=46449128044&partnerID=8YFLogxK
U2 - 10.1109/ACC.2007.4283127
DO - 10.1109/ACC.2007.4283127
M3 - Conference contribution
AN - SCOPUS:46449128044
SN - 1424409888
SN - 9781424409884
T3 - Proceedings of the American Control Conference
SP - 5106
EP - 5111
BT - Proceedings of the 2007 American Control Conference, ACC
T2 - 2007 American Control Conference, ACC
Y2 - 9 July 2007 through 13 July 2007
ER -