Implications of stop-and-go traffic on training learning-based car-following control

Anye Zhou, Srinivas Peeta, Hao Zhou, Jorge Laval, Zejiang Wang, Adian Cook

Research output: Contribution to journal › Article › peer-review

Abstract

Learning-based car-following control (LCC) of connected and autonomous vehicles (CAVs) is gaining significant attention with the advancement of computing power and data accessibility. While the flexibility and large model capacity of model-free architectures enable LCC to potentially outperform model-based car-following (CF) models in improving traffic efficiency and mitigating congestion, the generalizability of LCC to traffic conditions different from the training environment/dataset is not well understood. This study explores the impact of stop-and-go traffic in the training dataset on the generalizability of LCC. It uses the characteristics of lead vehicle trajectories to describe stop-and-go traffic, and links the theory of identifiability (i.e., obtaining a unique parameter estimate from sensor measurements) to the generalizability of behavior cloning (BC) and policy-based deep reinforcement learning (DRL). Correspondingly, the study shows theoretically that: (i) stop-and-go traffic can ensure identifiability and enhance the control performance of BC-based LCC in different traffic conditions; (ii) stop-and-go traffic is not necessary for DRL-based LCC to generalize to different traffic conditions; (iii) DRL-based LCC trained with only constant-speed lead vehicle trajectories (which are not sufficient to ensure identifiability) can generalize to different traffic conditions; and (iv) stop-and-go traffic increases variance in the training dataset, which improves the convergence of parameter estimation while negatively impacting the convergence of DRL to the optimal control policy. Numerical experiments validate these findings, illustrating that BC-based LCC requires comprehensive training datasets to generalize to different traffic conditions, whereas DRL-based LCC can achieve generalization with simple free-flow traffic training environments. This further suggests DRL as a more promising and cost-effective LCC approach to reduce operational costs, mitigate traffic congestion, and enhance safety and mobility, which can accelerate the deployment and acceptance of CAVs.
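
The identifiability argument above can be illustrated with a minimal numerical sketch. The snippet below is not the paper's model or code: it assumes a hypothetical linear car-following law with made-up gains (K_S, K_V, S0, T_GAP) and fits those parameters by ordinary least squares from simulated follower trajectories. With a constant-speed lead vehicle the follower sits at equilibrium, the regressors are constant, and the parameters cannot be uniquely recovered; an oscillating (stop-and-go) lead excites the dynamics enough for the estimates to converge, mirroring the role of identifiability for BC-based LCC.

import numpy as np

DT = 0.1                 # simulation time step [s]
HORIZON = 2000           # 200 s of simulated driving
K_S, K_V = 0.15, 0.60    # true feedback gains (hypothetical)
S0, T_GAP = 2.0, 1.5     # standstill spacing [m] and time gap [s] (hypothetical)

def lead_speed(t, stop_and_go):
    """Lead-vehicle speed profile: constant, or oscillating stop-and-go."""
    if stop_and_go:
        return np.clip(8.0 + 9.0 * np.sin(2 * np.pi * t / 60.0), 0.0, None)
    return np.full_like(t, 15.0)

def simulate(stop_and_go, noise=0.02, seed=0):
    """Roll out the follower under the linear CF law and log regression data."""
    rng = np.random.default_rng(seed)
    t = np.arange(HORIZON) * DT
    v_lead = lead_speed(t, stop_and_go)
    v = v_lead[0]                        # start at the equilibrium implied by the lead
    x, x_lead = 0.0, S0 + T_GAP * v
    rows, accs = [], []
    for k in range(HORIZON):
        s = x_lead - x                                           # spacing [m]
        a = K_S * (s - S0 - T_GAP * v) + K_V * (v_lead[k] - v)   # linear CF law
        rows.append([s - T_GAP * v, v_lead[k] - v, 1.0])         # regressors
        accs.append(a + noise * rng.standard_normal())           # noisy accel. measurement
        x_lead += v_lead[k] * DT
        v = max(v + a * DT, 0.0)
        x += v * DT
    return np.array(rows), np.array(accs)

for label, sg in [("constant-speed lead", False), ("stop-and-go lead", True)]:
    X, y = simulate(sg)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)    # estimates of [k_s, k_v, -k_s*s0]
    sigma_min = np.linalg.svd(X, compute_uv=False).min()
    print(f"{label:>20}: k_s={theta[0]:+.3f}  k_v={theta[1]:+.3f}  "
          f"min singular value={sigma_min:.2e}  (true k_s={K_S}, k_v={K_V})")

In the constant-speed case the regressor matrix is rank-deficient (its smallest singular value is zero), so least squares returns meaningless gains; the stop-and-go lead makes the parameters identifiable and the true gains are recovered. The paper's analysis concerns neural-network policies rather than this toy linear law, but the excitation mechanism is the same.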

Original language: English
Article number: 104578
Journal: Transportation Research Part C: Emerging Technologies
State: Accepted/In press - 2024

Keywords

  • Behavior cloning
  • Car-following control
  • Deep reinforcement learning
  • Generalizability
  • System identification
