TY - GEN
T1 - DICER
T2 - 25th IEEE International Conference on Mobile Data Management, MDM 2024
AU - De, Debraj
AU - Thakur, Gautam Malviya
AU - McGaha, Jesse
AU - Brown, Chance
AU - Nie, Xiuling
AU - Thomas, Todd
AU - Gaboardi, James D.
AU - Sparks, Kevin
AU - Burger, Annetta
AU - McBride, Elizabeth C.
AU - Kim, Joon Seok
AU - Amichi, Licia
AU - Gunaratne, Chathika
AU - Christopher, Carter
AU - Zubko, Dan
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - With the significant increase in the sources and volume of human mobility data, from commercial data vendors as well as from microsimulation of cities, the scale of geospatial-temporal data to analyze and assess for mobility characterization has grown to the level of big data. Commercial mobility organizations deploy scalable computing, but their system architectures, workflows, and intermediate processing components are often not fully disclosed. Current research literature notably lacks studies demonstrating architectures and workflows for human mobility analytics implemented on terabyte-scale geospatial-temporal data. In this context, this paper presents a hyperscale-level system solution named DICER (Data Intensive Computing Environment and Runtime) for processing and analytics of geospatial-temporal data at big data scale. Although DICER's cluster computing architecture, Apache Spark jobs running on a Kubernetes cluster, is not new, there are innovations in the workflow, the hierarchical processing logic, and a wide range of intermediate preprocessing and mobility metrics calculations. We performed case studies to validate the effectiveness of the DICER system solution through detailed analytics and assessment of human mobility microsimulation output at three different scopes and scales, including a use case with 16.97 terabytes and 259.2 billion rows of data. In addition, we present another case study that uses DICER to perform the same mobility processing and comparative analytics on large-scale, commercially available geospatial-temporal data. All these case studies validate the efficiency and usefulness of DICER in computing population mobility characteristics from geospatial-temporal trajectory data at an unprecedented scale (not only in data volume, but also in the combination of number of user entities, temporal frequency, spatial resolution, and data duration).
KW - Geospatial temporal data
KW - big data
KW - big data analytics
KW - cloud computing
KW - cluster computing
KW - human mobility
KW - mobility metrics
KW - test and evaluation (T&E)
UR - http://www.scopus.com/inward/record.url?scp=85199602561&partnerID=8YFLogxK
U2 - 10.1109/MDM61037.2024.00037
DO - 10.1109/MDM61037.2024.00037
M3 - Conference contribution
AN - SCOPUS:85199602561
T3 - Proceedings - IEEE International Conference on Mobile Data Management
SP - 139
EP - 148
BT - Proceedings - 2024 25th IEEE International Conference on Mobile Data Management, MDM 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 June 2024 through 27 June 2024
ER -