GENASIS Basics: Object-oriented utilitarian functionality for large-scale physics simulations (Version 4)

Research output: Contribution to journal › Article › peer-review

Abstract

GENASIS Basics provides modern Fortran classes furnishing extensible object-oriented utilitarian functionality for large-scale physics simulations on distributed memory supercomputers. This functionality includes physical units and constants; display to the screen or standard output device; message passing; I/O to disk; and runtime parameter management and usage statistics. This revision (Version 4 of Basics) includes a name change and additions to functionality, including the facilitation of direct communication between GPUs.

New version program summary:
Program title: SineWaveAdvection, SawtoothWaveAdvection, and RiemannProblem (fluid dynamics example problems illustrating GENASIS Basics); ArgonEquilibrium and ClusterFormation (molecular dynamics example problems illustrating GENASIS Basics)
CPC Library link to program files: https://doi.org/10.17632/6w9ygpygmc.3
Developer's repository link: https://github.com/GenASiS
Code Ocean capsule: https://codeocean.com/capsule/9737716
Licensing provisions: GPLv3
Programming language: Modern Fortran; OpenMP (tested with recent versions of GNU Compiler Collection (GCC), Cray Compiler Environment (CCE), and IBM XL Fortran compiler)
Journal reference of previous version: Comput. Phys. Commun. 244 (2019) 483
Does the new version supersede the previous version?: Yes
Nature of problem: By way of illustrating GENASIS Basics functionality, solve example fluid dynamics and molecular dynamics problems.
Solution method: For fluid dynamics examples, finite-volume. For molecular dynamics examples, leapfrog and velocity-Verlet integration.
Reasons for new version: This version includes a significant name change, some minor additions to functionality, and two major additions to functionality: support for systems using AMD GPUs and infrastructure facilitating GPU-aware MPI communications.
Summary of revisions: The CONSTANT singleton has been updated to 2022 values [1].
The class MeasuredValueForm (a class for handling numbers with labels, providing a means of dealing with units) has been renamed QuantityForm. An AddCommand and MultiplyAddCommand have been added to the ArrayOperations division of the code. The Real_1D_Form and Real_3D_Form classes, used to construct “ragged arrays,” now have AllocateDevice ( ) methods to provide mirror allocation of GPU memory. Show_Command now has an option to allow the display of more digits for integer and real numbers.
In the CurveImageForm and StructuredGridImageForm classes used for I/O, the SetGrid and SetReadAttributes methods have been replaced by SetGridWrite and SetGridRead, respectively. An optional flag StorageOnlyOption of their Read methods provides streamlined data input that assumes the data being read conform to the grid resolution and domain decomposition of the currently running program.
In the PROGRAM_HEADER singleton, the method RecordStatistics replaces ShowStatistics for recording memory usage and timers. The recording of these data has been refactored and streamlined. Memory usage statistics are now available on macOS. To let timer data be ordered organically, in the order in which the timers are encountered in the code (and so avoid the need for hard-coded timer setup routines), the AddTimer method has been deleted, and the Timer method returns a pointer to either an existing instance of TimerForm or a new one, if it does not yet exist. WARNING: It is important to initialize timer handle variables to zero (or a negative value) in order for the code to recognize that a new timer needs to be created, and to avoid spurious handle values.
GPU-aware MPI communication (passing GPU memory addresses directly to MPI routines) is now supported by the MessagePassing classes (see the original and Version 2 updates of this article for more detailed descriptions of the MessagePassing classes).
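The timer-handle convention described above can be illustrated with a minimal, language-neutral sketch in Python. This is not the GENASIS Fortran API: `TimerRegistry`, its `timer` method, and the 1-based handle scheme are hypothetical names and choices made for illustration. The sketch assumes only what the text states, namely that a handle of zero (or a negative value) signals that a new timer must be created, and that timers end up ordered by first encounter. Because Python integers are immutable, the sketch returns the updated handle rather than modifying it in place.

```python
import time


class Timer:
    """Accumulates wall-clock time over repeated start/stop intervals."""

    def __init__(self, name):
        self.name = name
        self.total = 0.0
        self._started = None

    def start(self):
        self._started = time.perf_counter()

    def stop(self):
        self.total += time.perf_counter() - self._started
        self._started = None


class TimerRegistry:
    """Hypothetical registry: a handle is a 1-based index into a list."""

    def __init__(self):
        self.timers = []

    def timer(self, handle, name):
        """Return (timer, handle), creating the timer if handle <= 0."""
        if handle <= 0:
            self.timers.append(Timer(name))
            handle = len(self.timers)
        return self.timers[handle - 1], handle


registry = TimerRegistry()
h_compute = 0  # must start at zero (or negative) so a timer is created
h_comm = 0

for step in range(3):
    t, h_compute = registry.timer(h_compute, "Compute")
    t.start()
    sum(range(1000))  # stand-in for real work
    t.stop()

    t, h_comm = registry.timer(h_comm, "Communicate")
    t.start()
    t.stop()

# Timers appear in the order they were first encountered.
print([t.name for t in registry.timers])  # prints ['Compute', 'Communicate']
```

In this sketch a handle left uninitialized with an arbitrary positive value would either index the wrong timer or fail, which is the kind of spurious-handle problem the warning above is guarding against.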
A new method AllocateDevice ( ) has been added to the MessagePassing classes to activate GPU-aware MPI communication. When the communication buffers are allocated in an instantiation of the class, AllocateDevice ( ) creates a mirror allocation of the buffers on the GPU. When associations with pre-existing arrays are used as the communication buffers with the class instantiation, AllocateDevice ( ) deduces the GPU memory addresses associated with these buffers to be used for future MPI communications. The example fluid dynamics problem RiemannProblem included in this release has been modified to illustrate the use of GPU-aware MPI. In the DistributedMeshForm class, a call to the AllocateDevice ( ) method is made for the instances of the MessageIncoming_* and the MessageOutgoing_* classes when GPU offload is enabled (see [2] and Version 3 of this article for more detailed descriptions of GPU offload in GENASIS). The use of GPU-aware communication can be explicitly turned on or off using a command-line argument DevicesCommunicate=T or DevicesCommunicate=F, respectively. On the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) [3], exploiting GPU-aware communications yields speedups of over 20% for RiemannProblem, because explicit data movement from GPU memory to CPU memory for MPI communications is avoided. The example program RiemannProblem has been modified such that the use of GPU offload can be controlled by a command-line argument UseDevice=[T,F] when the executable is built with OpenMP offload support. For example, on OLCF Summit, the following commands build the three-dimensional RiemannProblem with 512^3 cells and execute it three times with eight MPI processes. The first run, by default, uses GPU offload and GPU-aware communications. The second run uses MPI communications on the host.
The third run uses OpenMP threading on the CPU by disabling GPU offload, which also automatically disables GPU-aware communications. [Command listing omitted in this record]
Finally, this revision adds support for AMD GPUs and other accelerators supported by the HIP programming model [4], as provided by the new file Device_HIP.c under the directory Modules/Basics/Devices. The functionalities provided here are either not currently available in OpenMP or not yet implemented widely, such as inquiry of GPU memory usage and allocation of host page-locked memory. The use of Device_CUDA.c and Device_HIP.c is mutually exclusive and controlled by the Makefile variables DEVICE_CUDA and DEVICE_HIP, respectively. An example of how this is done can be found in the machine Makefile Makefile_Cray_CCE.
Additional comments including restrictions and unusual features: Uses the MPI [5] and Silo [6] libraries. The example problems named above are not ends in themselves, but serve to illustrate our object-oriented approach and the functionality available through GENASIS Basics. In addition to these more substantial examples, we provide unit test programs for the individual classes that make up GENASIS Basics. GENASIS Basics is available in the CPC Program Library and also at https://github.com/GenASiS.
References:
[1] R.L. Workman et al., Particle Data Group, Prog. Theor. Exp. Phys. 2022 (2022) 083C01.
[2] R.D. Budiardja, C.Y. Cardall, Parallel Comput. 88 (2019) 102544.
[3] https://docs.olcf.ornl.gov/systems/summit_user_guide.html.
[4] https://rocmdocs.amd.com/en/latest/Programming_Guides/Programming-Guides.html.
[5] https://www.mpi-forum.org.
[6] https://wci.llnl.gov/simulation/computer-codes/silo.
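The interaction between the UseDevice and DevicesCommunicate arguments described in the summary of revisions can be sketched as follows. This is a minimal Python illustration of the stated logic, not code from GENASIS: the function names, the `offload_built` parameter, and the defaults of "on" when an argument is absent are assumptions, inferred from the statements that the default run uses GPU offload and GPU-aware communications and that disabling offload also disables GPU-aware communications. The argument lists used for the three runs are likewise inferred, since the original command listing is not available here.

```python
def parse_flags(argv):
    """Collect Name=T / Name=F arguments into a dict of booleans."""
    flags = {}
    for arg in argv:
        name, sep, value = arg.partition("=")
        if sep and value in ("T", "F"):
            flags[name] = (value == "T")
    return flags


def resolve(argv, offload_built=True):
    """Return (use_device, devices_communicate) for a given command line.

    offload_built is a hypothetical stand-in for "the executable was built
    with OpenMP offload support". Defaults of True are an assumption.
    """
    flags = parse_flags(argv)
    use_device = offload_built and flags.get("UseDevice", True)
    # GPU-aware communication is only meaningful when offload is active.
    devices_communicate = use_device and flags.get("DevicesCommunicate", True)
    return use_device, devices_communicate


# Three runs analogous to those described in the text (arguments inferred).
for args in ([], ["DevicesCommunicate=F"], ["UseDevice=F"]):
    print(args, resolve(args))
```

Under these assumptions the first run resolves to offload with GPU-aware communications, the second to offload with host-side MPI, and the third to neither.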

Original language: English
Article number: 108505
Journal: Computer Physics Communications
Volume: 281
State: Published - Dec 2022

Funding

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This material is based upon work supported by the U.S. Department of Energy Office of Science, Office of Nuclear Physics under contract number DE-AC05-00OR22725 and the National Science Foundation under Grant No. 1535130. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

Funders                        Funder number
DOE Public Access Plan
United States Government
National Science Foundation    1535130
U.S. Department of Energy
Office of Science
Nuclear Physics                DE-AC05-00OR22725

    Keywords

    • GPU-accelerated simulation
    • Modern Fortran
    • Object-oriented programming
    • OpenMP offload
    • Simulation framework
