TY - JOUR
T1 - Missing data imputation for traffic flow speed using spatio-temporal cokriging
AU - Bae, Bumjoon
AU - Kim, Hyun
AU - Lim, Hyeonsup
AU - Liu, Yuandong
AU - Han, Lee D.
AU - Freeze, Phillip B.
N1 - Publisher Copyright: © 2018 Elsevier Ltd
PY - 2018/3
Y1 - 2018/3
N2 - Modern transportation systems rely increasingly on the availability and accuracy of traffic detector data to monitor traffic operational conditions and assess system performance. Missing data, which occurs almost inevitably for a number of reasons, can lead to suboptimal operations and ineffective decisions if not remedied in a timely and systematic fashion through data imputation. A review of the literature suggests that most traffic data imputation studies considered the temporal continuity of the data but often overlooked the spatial correlations that exist. Few of the studies explored the randomness of the patterns of the missing data. Therefore, this paper proposes two cokriging methods that exploit the existence of spatio-temporal dependency in traffic data and employ multiple data sources, each with independently missing data, to impute high-resolution traffic speed data under different data missing pattern scenarios. The two proposed cokriging methods, both using multiple independent data sources, were benchmarked against classic simple and ordinary kriging methods, which use only the primary data source. An array of testing scenarios was designed to test these methods under different missing rates (10–40% data loss) and different missing patterns (random in time and location, random only in location, and non-random blocks of missing data). The results suggest that using multiple data sources with the spatio-temporal simple cokriging method effectively improves the imputation accuracy if the missing data were clustered, or in blocks. On the other hand, if the missing data were randomly scattered in time and location, the classic ordinary or simple kriging method using only the primary data source can be more effective. Our study, which employs empirical traffic speed data from radar detectors and vehicle probes, demonstrates that the overall predictions of the kriging-based imputation approach are accurate and reliable for all combinations of missing patterns and missing rates investigated.
AB - Modern transportation systems rely increasingly on the availability and accuracy of traffic detector data to monitor traffic operational conditions and assess system performance. Missing data, which occurs almost inevitably for a number of reasons, can lead to suboptimal operations and ineffective decisions if not remedied in a timely and systematic fashion through data imputation. A review of the literature suggests that most traffic data imputation studies considered the temporal continuity of the data but often overlooked the spatial correlations that exist. Few of the studies explored the randomness of the patterns of the missing data. Therefore, this paper proposes two cokriging methods that exploit the existence of spatio-temporal dependency in traffic data and employ multiple data sources, each with independently missing data, to impute high-resolution traffic speed data under different data missing pattern scenarios. The two proposed cokriging methods, both using multiple independent data sources, were benchmarked against classic simple and ordinary kriging methods, which use only the primary data source. An array of testing scenarios was designed to test these methods under different missing rates (10–40% data loss) and different missing patterns (random in time and location, random only in location, and non-random blocks of missing data). The results suggest that using multiple data sources with the spatio-temporal simple cokriging method effectively improves the imputation accuracy if the missing data were clustered, or in blocks. On the other hand, if the missing data were randomly scattered in time and location, the classic ordinary or simple kriging method using only the primary data source can be more effective. Our study, which employs empirical traffic speed data from radar detectors and vehicle probes, demonstrates that the overall predictions of the kriging-based imputation approach are accurate and reliable for all combinations of missing patterns and missing rates investigated.
KW - Cokriging
KW - Imputation
KW - Missing data
KW - Missing patterns
KW - Spatio-temporal kriging
UR - http://www.scopus.com/inward/record.url?scp=85044645367&partnerID=8YFLogxK
U2 - 10.1016/j.trc.2018.01.015
DO - 10.1016/j.trc.2018.01.015
M3 - Article
AN - SCOPUS:85044645367
SN - 0968-090X
VL - 88
SP - 124
EP - 139
JO - Transportation Research Part C: Emerging Technologies
JF - Transportation Research Part C: Emerging Technologies
ER -