On generalization error of neural network models and its application to predictive control of nonlinear processes

Mohammed S. Alhajeri, Aisha Alnajdi, Fahim Abdullah, Panagiotis D. Christofides

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Recurrent neural networks (RNNs) and long short-term memory (LSTM) networks are frequently used to approximate nonlinear dynamic systems from time-series data. The training error of such networks can often be made sufficiently small; however, accuracy can be further improved by incorporating prior knowledge into the construction of machine learning-based models. Specifically, physics-based RNN modeling has yielded more reliable models than traditional RNNs. Yet a framework for constructing such RNN models, as well as LSTM models, and assessing their generalization ability for use in model predictive control (MPC) systems is lacking. In this work, we develop a methodological framework to quantify generalization error bounds for partially-connected RNNs and LSTM models. The partially-connected RNN model is then used to predict the state evolution in an MPC scheme. We illustrate through open-loop and closed-loop simulations of a nonlinear chemical process consisting of two reactors in series that the proposed approach provides a flexible framework for leveraging both prior knowledge and data, thereby significantly improving performance compared to a fully-connected modeling approach under Lyapunov-based MPC.
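
As a rough illustration of what "partially-connected" can mean, the sketch below masks the weights of a single tanh RNN step so that prior structural knowledge (here, that the second reactor does not influence the first) is built into the network. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the layer sizes, the mask layout, and the random weights are all hypothetical.

```python
import math
import random


def masked_matvec(W, mask, v):
    """Multiply W by v, zeroing every weight whose mask entry is 0."""
    return [
        sum(w * m * x for w, m, x in zip(w_row, m_row, v))
        for w_row, m_row in zip(W, mask)
    ]


def rnn_step(h, x, W_h, W_x, mask_h, mask_x):
    """One step of a tanh RNN with masked (partially-connected) weights."""
    rec = masked_matvec(W_h, mask_h, h)
    inp = masked_matvec(W_x, mask_x, x)
    return [math.tanh(r + i) for r, i in zip(rec, inp)]


# Assumed layout: hidden units 0-1 model reactor 1, units 2-3 model reactor 2.
# Reactor 1 feeds reactor 2, but not the other way round.
mask_h = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
# Assumed inputs: x[0] belongs to reactor 1, x[1] belongs to reactor 2.
mask_x = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]

# Hypothetical untrained weights, for illustration only.
rng = random.Random(0)
W_h = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
W_x = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(4)]

h0 = [0.1, -0.2, 0.3, 0.05]
x_a = [0.5, 0.1]
x_b = [0.5, 0.9]  # only the reactor-2 input differs from x_a

h_a = rnn_step(h0, x_a, W_h, W_x, mask_h, mask_x)
h_b = rnn_step(h0, x_b, W_h, W_x, mask_h, mask_x)
```

In this sketch, changing only the reactor-2 input leaves the reactor-1 hidden units unchanged, whereas a fully-connected network would have to learn that independence from data.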

Original language: English
Pages (from-to): 664-679
Number of pages: 16
Journal: Chemical Engineering Research and Design
Volume: 189
State: Published - Jan 2023
Externally published: Yes

Keywords

  • Generalization error
  • Long short-term memory
  • Machine learning
  • Model predictive control
  • Nonlinear systems
  • Partially-connected RNN
  • Recurrent neural networks
