DICER: Data Intensive Computing Environment and Runtime for Evaluating Unprecedented Scale of Geospatial-Temporal Human Mobility Data

Debraj De, Gautam Malviya Thakur, Jesse McGaha, Chance Brown, Xiuling Nie, Todd Thomas, James D. Gaboardi, Kevin Sparks, Annetta Burger, Elizabeth C. McBride, Joon Seok Kim, Licia Amichi, Chathika Gunaratne, Carter Christopher, Dan Zubko

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

With the significant increase in sources and volume of human mobility data, from commercial data vendors as well as from microsimulation of cities, the scale of geospatial-temporal data to analyze and assess for mobility characterization has grown to the level of Big Data. Commercial mobility organizations deploy scalable computing, but their system architecture, workflow, and intermediate processing components are often not fully disclosed. The research literature notably lacks studies demonstrating architectures and workflows for human mobility analytics implemented on terabyte-scale geospatial-temporal data. In this context, this paper presents a hyperscale-level system solution named DICER (Data Intensive Computing Environment and Runtime) for processing and analytics of geospatial-temporal data at big data scale. Although the cluster computing architecture of DICER, with Apache Spark jobs running on a Kubernetes cluster, is not new, there are innovations in the workflow, the hierarchical processing logic, and a wide range of intermediate preprocessing and mobility metrics calculations. We performed case studies to validate the effectiveness of the DICER system solution through detailed analytics and assessment of human mobility microsimulation output at three different scopes and scales, including a use case with 16.97 terabytes and 259.2 billion rows of data. In addition, we present another case study that uses DICER to perform the same mobility processing and comparative analytics on large-scale commercially available geospatial-temporal data. All these case studies validate the efficiency and usefulness of DICER in computing population mobility characteristics from geospatial-temporal trajectory data at an unprecedented scale (not only in data volume, but also in the combination of number of user entities, temporal frequency, spatial resolution, and data duration).

Original language: English
Title of host publication: Proceedings - 2024 25th IEEE International Conference on Mobile Data Management, MDM 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 139-148
Number of pages: 10
ISBN (Electronic): 9798350374551
State: Published - 2024
Event: 25th IEEE International Conference on Mobile Data Management, MDM 2024 - Brussels, Belgium
Duration: Jun 24 2024 - Jun 27 2024

Publication series

Name: Proceedings - IEEE International Conference on Mobile Data Management
ISSN (Print): 1551-6245

Conference

Conference: 25th IEEE International Conference on Mobile Data Management, MDM 2024
Country/Territory: Belgium
City: Brussels
Period: 06/24/24 - 06/27/24

Funding

This work is supported by the Intelligence Advanced Research Projects Activity (IARPA). The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOE, or the U.S. Government. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://energy.gov/downloads/doe-public-access-plan).

Funders (funder number, where listed)
Intelligence Advanced Research Projects Activity
DOE Public Access Plan
U.S. Department of Energy: DE-AC05-00OR22725

    Keywords

    • Geospatial temporal data
    • big data
    • big data analytics
    • cloud computing
    • cluster computing
    • human mobility
    • mobility metrics
    • test and evaluation (T&E)
