Improving the accuracy of freight mode choice models: A case study using the 2017 CFS PUF data set and ensemble learning techniques

Diyi Liu, Hyeonsup Lim, Majbah Uddin, Yuandong Liu, Lee D. Han, Ho ling Hwang, Shih Miao Chin

Research output: Contribution to journal › Article › peer-review

Abstract

The US Census Bureau has collected two rounds of experimental data from the Commodity Flow Survey, providing shipment-level characteristics of nationwide commodity movements, published in 2012 (i.e., Public Use Microdata) and in 2017 (i.e., Public Use File). With this information, data-driven methods have become increasingly valuable for understanding detailed patterns in freight logistics. In this study, we used the 2017 Commodity Flow Survey Public Use File data set to explore building a high-performance freight mode choice model, considering three main improvements: (1) constructing local models for each separate commodity/industry category; (2) extracting useful geographical features, particularly the derived distance of each freight mode between origin/destination zones; and (3) applying additional ensemble learning methods such as stacking or voting to combine results from local and unified models for improved performance. The proposed method achieved over 92% accuracy without incorporating external information, an increase of over 19% compared to directly fitting Random Forest models over 10,000 samples. Furthermore, SHAP (SHapley Additive exPlanations) values were computed to explain the outputs and major patterns obtained from the proposed model. The model framework could enhance the performance and interpretability of existing freight mode choice models.
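The idea of combining per-commodity local models with a single unified model through voting can be illustrated with a small sketch. This is not the authors' implementation: it swaps Random Forests for simple frequency tables, uses soft voting rather than stacking, and relies only on the Python standard library. The mode labels, the distance thresholds, the record fields (`commodity`, `miles`, `mode`), the `local_weight` parameter, and the toy shipments are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

MODES = ["truck", "rail", "parcel"]  # hypothetical mode labels


def distance_bin(miles):
    """Coarse distance bucket; the only feature in this toy sketch."""
    if miles < 100:
        return "short"
    if miles < 500:
        return "medium"
    return "long"


class FrequencyModel:
    """Stand-in classifier: mode frequencies per distance bin."""

    def __init__(self):
        self.counts = defaultdict(Counter)
        self.overall = Counter()

    def fit(self, shipments):
        # Assumes a non-empty list of shipment records.
        for s in shipments:
            self.counts[distance_bin(s["miles"])][s["mode"]] += 1
            self.overall[s["mode"]] += 1
        return self

    def predict_proba(self, shipment):
        # Fall back to overall frequencies for an unseen distance bin.
        c = self.counts.get(distance_bin(shipment["miles"])) or self.overall
        total = sum(c.values())
        return {m: c[m] / total for m in MODES}


def fit_ensemble(shipments):
    """Fit one unified model plus one local model per commodity."""
    unified = FrequencyModel().fit(shipments)
    by_commodity = defaultdict(list)
    for s in shipments:
        by_commodity[s["commodity"]].append(s)
    local = {c: FrequencyModel().fit(rows) for c, rows in by_commodity.items()}
    return unified, local


def predict(unified, local, shipment, local_weight=0.5):
    """Soft vote: weighted average of local and unified probabilities."""
    p = unified.predict_proba(shipment)
    model = local.get(shipment["commodity"])
    if model is not None:  # unseen commodities use the unified model only
        p_loc = model.predict_proba(shipment)
        p = {m: local_weight * p_loc[m] + (1 - local_weight) * p[m]
             for m in MODES}
    return max(MODES, key=lambda m: p[m])


# Hypothetical toy shipments, not CFS data.
train = [
    {"commodity": "coal", "miles": 800, "mode": "rail"},
    {"commodity": "coal", "miles": 900, "mode": "rail"},
    {"commodity": "coal", "miles": 50, "mode": "truck"},
    {"commodity": "electronics", "miles": 800, "mode": "parcel"},
    {"commodity": "electronics", "miles": 700, "mode": "parcel"},
    {"commodity": "electronics", "miles": 900, "mode": "parcel"},
    {"commodity": "electronics", "miles": 60, "mode": "truck"},
]
unified, local = fit_ensemble(train)
coal_long = {"commodity": "coal", "miles": 850}
print(predict(unified, local, coal_long))
```

In this toy data, long-haul shipments are mostly parcel overall, so the unified model alone would favor parcel for the long coal shipment. The coal-specific local model shifts the vote toward rail. A stacking variant would instead train a second-stage model on the local and unified outputs rather than averaging them with a fixed weight.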

Original language: English
Article number: 122478
Journal: Expert Systems with Applications
Volume: 240
State: Published - Apr 15 2024

Funding

Notice: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.

Acknowledgement: The authors appreciate the funding and research opportunities provided by the Graduate Advancement Training and Education for collaborative research between the University of Tennessee, Knoxville, and the US Department of Energy’s Oak Ridge National Laboratory.

Statement: During the preparation of this work, the authors used gpt-3.5-turbo to improve the fluency and clarity of the writing. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Keywords

  • Behavior analysis
  • Commodity Flow Survey
  • Freight mode choice
  • Interpretable machine learning
  • Machine learning
  • Stacking method
