Abstract
Using the data from loop detector sensors for near-real-time detection of traffic incidents on highways is crucial to averting major traffic congestion. While recent supervised machine learning methods offer solutions to incident detection by leveraging human-labeled incident data, the false alarm rate is often too high to be used in practice. Specifically, the inconsistency in the human labeling of the incidents significantly affects the performance of supervised learning models. To that end, we focus on a data-centric approach to improve the accuracy and reduce the false alarm rate of traffic incident detection on highways. We develop a weak supervised learning workflow to generate high-quality training labels for the incident data without the ground truth labels, and we use those generated labels in the supervised learning setup for final detection. This approach comprises three stages. First, we introduce a data preprocessing and curation pipeline that processes traffic sensor data to generate high-quality training data through leveraging labeling functions, which can be domain knowledge-related or simple heuristic rules. Second, we evaluate the training data generated by weak supervision using three supervised learning models—random forest, k-nearest neighbors, and a support vector machine ensemble—and long short-term memory classifiers. The results show that the accuracy of all of the models improves significantly after using the training data generated by weak supervision. Third, we develop an online real-time incident detection approach that leverages the model ensemble and the uncertainty quantification while detecting incidents. Overall, we show that our proposed weak supervised learning workflow achieves a high incident detection rate (0.90) and low false alarm rate (0.08).
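The labeling-function idea in the first stage, and the two reported metrics, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field names `speed_mph` and `occupancy`, the thresholds, and the three rules are hypothetical, and the abstract does not say how labeling-function votes are combined, so a simple majority vote stands in for whatever label model the authors used. The false alarm rate is computed here as false positives over all true non-incident samples, which is one common definition and may differ from the paper's.

```python
from collections import Counter

ABSTAIN, NORMAL, INCIDENT = -1, 0, 1

# Hypothetical labeling functions over one loop-detector reading.
# Thresholds are illustrative placeholders, not values from the paper.
def lf_low_speed(reading):
    return INCIDENT if reading["speed_mph"] < 25.0 else ABSTAIN

def lf_high_occupancy(reading):
    return INCIDENT if reading["occupancy"] > 0.35 else ABSTAIN

def lf_free_flow(reading):
    return NORMAL if reading["speed_mph"] > 55.0 else ABSTAIN

LABELING_FUNCTIONS = [lf_low_speed, lf_high_occupancy, lf_free_flow]

def weak_label(reading):
    """Majority vote over non-abstaining rules; abstain on a tie or no votes."""
    votes = [lf(reading) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v != ABSTAIN)
    incident, normal = counts[INCIDENT], counts[NORMAL]
    if incident == normal:  # covers both "no votes" and ties
        return ABSTAIN
    return INCIDENT if incident > normal else NORMAL

def detection_and_false_alarm_rate(y_true, y_pred):
    """Return (detection rate, false alarm rate) for binary incident labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == INCIDENT and p == INCIDENT)
    fn = sum(1 for t, p in pairs if t == INCIDENT and p != INCIDENT)
    fp = sum(1 for t, p in pairs if t == NORMAL and p == INCIDENT)
    tn = sum(1 for t, p in pairs if t == NORMAL and p != INCIDENT)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate

if __name__ == "__main__":
    sample = {"speed_mph": 15.0, "occupancy": 0.5}
    print(weak_label(sample))
    print(detection_and_false_alarm_rate([1, 1, 0, 0, 0], [1, 0, 1, 0, 0]))
```

Readings on which the rules abstain or disagree evenly are left unlabeled in this sketch; a downstream classifier would be trained only on the readings that receive a label.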
Original language | English |
---|---|
Article number | 106779 |
Journal | Accident Analysis and Prevention |
Volume | 176 |
State | Published - Oct 2022 |
Externally published | Yes |
Funding
This material is based in part upon work supported by the U.S. Department of Energy, Office of Science, United States of America, under contract DE-AC02-06CH11357. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility under contract DE-AC02-06CH11357. This report and the work were sponsored by the U.S. Department of Energy (DOE) Vehicle Technologies Office (VTO) under the Big Data Solutions for Mobility Program, an initiative of the Energy Efficient Mobility Systems (EEMS) Program. David Anderson and Prasad Gupte, the DOE Office of Energy Efficiency and Renewable Energy (EERE) managers, played important roles in establishing the project concept, advancing implementation, and providing ongoing guidance. All authors approved the version of the manuscript to be published.
Funders | Funder number |
---|---|
U.S. Department of Energy | |
Office of Science | DE-AC02-06CH11357 |
Office of Energy Efficiency and Renewable Energy | |
Keywords
- Data-centric machine learning
- Recurrent neural network
- Traffic incident detection
- Weak supervision