Race conditions and data partitioning: risks posed by common errors to reproducible parallel simulations

James Nutaro, Ozgur Ozmen

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

When parallel algorithms for simulation were introduced in the 1970s, their development and use interested only experts in parallel computation. This circumstance changed as multi-core processors became commonplace, putting a parallel computer into the hands of every modeler. A natural outcome is growing interest in parallel simulation among persons not intimately familiar with parallel computing. At the same time, parallel simulation tools continue to be developed with the implicit assumption that the modeler is knowledgeable about parallel programming. The unintended consequence is a rapidly growing number of users of parallel simulation tools who are unlikely to recognize when the interaction of race conditions, partitioning strategies, and simultaneous action in their simulation models makes results non-reproducible, thereby calling into question the validity of conclusions drawn from the simulation data. We illustrate the potential dangers of exposing parallel algorithms to users who are not experts in parallel computation with example models constructed using existing parallel simulation tools. By doing so, we hope to refocus tool developers on usability, even if this new focus incurs loss of some performance.
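The interaction between simultaneous action and partitioning described above can be made concrete with a small sketch. It is not one of the article's example models and uses no real simulation tool: the function names `run` and `run_with_tiebreak`, the agents, and the single shared resource are all hypothetical. Assuming that events with equal timestamps are processed in whatever order the partitions happen to be visited, the same model can produce different results under different partitionings, whereas a tie-breaking rule that does not depend on the partitioning restores reproducibility.

```python
def run(partitions):
    """Process simultaneous 'take one unit' events, partition by partition.

    Hypothetical model: every agent acts at the same simulation time,
    and only one unit of a shared resource is available.
    """
    stock = 1
    winners = []
    for part in partitions:      # order reflects how the model was partitioned
        for agent in part:       # all events carry the same timestamp
            if stock > 0:
                stock -= 1
                winners.append(agent)
    return winners


def run_with_tiebreak(partitions):
    """Order simultaneous events by a key that ignores the partitioning."""
    agents = sorted(agent for part in partitions for agent in part)
    return run([agents])


# Same model, two partitionings, two different outcomes.
print(run([["alice"], ["bob"]]))
print(run([["bob"], ["alice"]]))

# With a partition-independent tie-break, the outcome no longer changes.
print(run_with_tiebreak([["alice"], ["bob"]]))
print(run_with_tiebreak([["bob"], ["alice"]]))
```

In this sketch the ordering is fixed by list position, so each run is deterministic; in a real parallel tool the ordering may also depend on thread scheduling, which would make the first pair of results vary between runs of the same partitioning.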

Original language: English
Pages (from-to): 417-427
Number of pages: 11
Journal: SIMULATION
Volume: 99
Issue number: 4
State: Published - Apr 2023

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory (ORNL), managed by UT-Battelle, LLC for the U. S. Department of Energy under Contract No. DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders                          Funder number
DOE Public Access Plan
United States Government
U.S. Department of Energy        DE-AC05-00OR22725
Oak Ridge National Laboratory
UT-Battelle

    Keywords

    • Parallel simulation
    • agent based
    • discrete event
    • reproducibility
