Covirt: Lightweight fault isolation and resource protection for co-kernels

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

The challenges of the exascale era have generated a number of advancements in HPC systems software, with co-kernel architectures emerging as one such novel approach for HPC operating system and runtime (OS/R) design. Co-kernels function by running multiple specialized, lightweight OS kernels natively on the same host as a general-purpose OS/R. These specialized kernels are able to provide optimized OS/R environments for HPC applications while still retaining access to the full feature set of the co-running general-purpose OS/R. While co-kernels are able to effectively optimize for performance, they generally lack effective mechanisms for cross-OS/R fault isolation and resource protection. In this paper we present Covirt, a lightweight OS/R protection layer that leverages the hardware virtualization features found on modern CPUs. Covirt interposes a minimal hypervisor layer between a co-kernel OS/R and hardware to prevent OS-level faults from impacting other OS/Rs running on the same system. Covirt differs from other virtualization-based approaches in the level of integration necessary between the co-kernel instances, which requires support for higher-level semantic interfaces between the different OS/Rs. Covirt features a split architecture consisting of a hypervisor and a controller module that continuously monitors changes to the underlying resource partitioning and translates those events into hypervisor configuration changes. We have implemented a prototype of Covirt in the context of the Hobbes exascale OS/R stack, specifically targeting the Pisces co-kernel framework and the Kitten Lightweight Kernel. Our evaluation shows that Covirt is able to add fault isolation for memory and interrupt processing with minimal performance overheads.

Original language: English
Title of host publication: Proceedings - 2021 IEEE 35th International Parallel and Distributed Processing Symposium, IPDPS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 310-319
Number of pages: 10
ISBN (Electronic): 9781665440660
State: Published - May 2021
Externally published: Yes
Event: 35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021 - Virtual, Online
Duration: May 17 2021 to May 21 2021

Publication series

Name: Proceedings - 2021 IEEE 35th International Parallel and Distributed Processing Symposium, IPDPS 2021

Conference

Conference: 35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021
City: Virtual, Online
Period: 05/17/21 to 05/21/21

Funding

This material is based upon work supported by the National Science Foundation under Grant No. 1718287.

Keywords

  • Cokernels
  • Hardware virtualization
  • Virtualization
