Abstract
Previous studies have revealed that paravirtualization imposes minimal performance overhead on High Performance Computing (HPC) workloads, while exposing numerous benefits for this field. In this study, we investigate the impact of paravirtualization on the performance of automatically tuned software systems. We compare peak performance, performance degradation in constrained memory situations, performance degradation in multi-threaded applications, and inter-VM shared memory performance. For comparison purposes, we examine the proficiency of ATLAS, a quintessential example of an autotuning software system, in tuning the BLAS library routines for paravirtualized systems. Our results show that the combination of ATLAS and Xen paravirtualization delivers native execution performance and nearly identical memory hierarchy performance profiles in both single- and multi-threaded scenarios. Furthermore, we show that it is possible to achieve memory sharing among OS instances at native speeds. These results expose new benefits to memory-intensive applications arising from the ability to slim down the guest OS without affecting system performance. In addition, our findings support a novel and very attractive deployment scenario for computational science and engineering codes on virtual clusters and computational clouds.
Original language | English |
---|---|
Pages (from-to) | 101-122 |
Number of pages | 22 |
Journal | Cluster Computing |
Volume | 12 |
Issue number | 2 SPEC. ISS. |
DOIs | |
State | Published - 2009 |
Externally published | Yes |
Funding
This work is sponsored in part by NSF grants (ST-HEC-0444412 and CCF-0331645).
Funders | Funder number |
---|---|
National Science Foundation | ST-HEC-0444412, CCF-0331645 |
Keywords
- AutoTuning
- BLAS
- Cloud computing
- High performance
- Linear algebra
- Paravirtualization
- Virtual machine monitors