TY - JOUR
T1 - CUMULVS
T2 - Providing fault tolerance, visualization, and steering of parallel applications
AU - Geist, G. A.
AU - Kohl, James Arthur
AU - Papadopoulos, Philip M.
PY - 1997
Y1 - 1997
N2 - The use of visualization and computational steering can often assist scientists in analyzing large-scale scientific applications. Fault tolerance is of great importance when running on a distributed system. However, the details of implementing these features are complex and tedious, leaving many scientists with inadequate development tools. CUMULVS is a library that enables programmers to easily incorporate interactive visualization and computational steering into existing parallel programs. Built on the PVM virtual machine framework, CUMULVS is portable and interoperable with all the computer architectures that PVM works with, a growing list that now stands at about 60 architectures. The CUMULVS library is divided into two pieces: one for the application program and one for the (possibly commercial) visualization and steering front end. Together, these two libraries encompass all the connection and data protocols needed to dynamically attach multiple, independent viewer front ends to a running parallel application. Viewer programs can also steer one or more user-defined parameters to "close the loop" for computational experiments and analyses. CUMULVS allows the programmer to specify user-directed checkpoints for saving important program state in case of failures and also provides a mechanism to migrate tasks across heterogeneous machine architectures to achieve improved performance. Details of the CUMULVS design goals and compromises as well as future directions are given.
AB - The use of visualization and computational steering can often assist scientists in analyzing large-scale scientific applications. Fault tolerance is of great importance when running on a distributed system. However, the details of implementing these features are complex and tedious, leaving many scientists with inadequate development tools. CUMULVS is a library that enables programmers to easily incorporate interactive visualization and computational steering into existing parallel programs. Built on the PVM virtual machine framework, CUMULVS is portable and interoperable with all the computer architectures that PVM works with, a growing list that now stands at about 60 architectures. The CUMULVS library is divided into two pieces: one for the application program and one for the (possibly commercial) visualization and steering front end. Together, these two libraries encompass all the connection and data protocols needed to dynamically attach multiple, independent viewer front ends to a running parallel application. Viewer programs can also steer one or more user-defined parameters to "close the loop" for computational experiments and analyses. CUMULVS allows the programmer to specify user-directed checkpoints for saving important program state in case of failures and also provides a mechanism to migrate tasks across heterogeneous machine architectures to achieve improved performance. Details of the CUMULVS design goals and compromises as well as future directions are given.
UR - http://www.scopus.com/inward/record.url?scp=0031221351&partnerID=8YFLogxK
U2 - 10.1177/109434209701100305
DO - 10.1177/109434209701100305
M3 - Article
AN - SCOPUS:0031221351
SN - 1094-3420
VL - 11
SP - 224
EP - 235
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
IS - 3
ER -