Integrating Intel Xeon Phi Coprocessors into a Cluster Environment

Paul Peltz, Troy Baer, Ryan Braby, Vince Betro, Glenn Brook, Karl Schulz

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

New technologies are a disruptive force in high-performance computing (HPC). Mostly, these technologies have benefited the HPC community, and the Intel® Xeon Phi™ coprocessor is no exception. These new technologies bring new challenges and new system and software design paradigms. The challenge for systems administrators is to integrate them into the ecosystem of their infrastructure. Intel Xeon Phi coprocessors are quickly becoming a leading choice among researchers, as evidenced by the growing number of supercomputers that use these devices. However, integrating this coprocessor can be a real challenge because it introduces an embedded operating system into established supercomputer ecosystems. Not only do systems administrators have to provision and stage nodes for use in their clusters, they also have to find a way to provision the coprocessors' environments so that they do not disrupt what users are accustomed to on other systems. This chapter outlines some of the early history and motivations for the deployment of the Beacon cluster at the National Institute for Computational Sciences in Oak Ridge, Tennessee, along with some "pearls" of wisdom that we have learned along the way in integrating the coprocessor seamlessly into the cluster.

Original language: English
Title of host publication: High Performance Parallelism Pearls
Subtitle of host publication: Multicore and Many-core Programming Approaches
Publisher: Elsevier Inc.
Pages: 255-276
Number of pages: 22
ISBN (Electronic): 9780128021996
ISBN (Print): 9780128021187
State: Published - 2015

Funding

The University of Tennessee and the Oak Ridge National Laboratory jointly operate the Joint Institute for Computational Sciences (JICS). In 2007, JICS, through the National Science Foundation (NSF) award for the Kraken project, established the National Institute for Computational Sciences (NICS). In early 2011, NICS and Intel embarked on a multiyear strategic engagement to pursue the development of next-generation, high-performance computing (HPC) solutions based on the Intel® Many Integrated Core (Intel® MIC) architecture, now branded as the Intel® Xeon Phi™ product family. NICS received early access to Intel MIC technologies in return for application testing, performance results, and expert feedback to help guide ongoing development efforts at Intel. The resulting collaboration allowed the Application Acceleration Center of Excellence within JICS to explore software performance using several preproduction software development platforms, codenamed Knights Ferry (KNF), that were networked together to produce one of the first clusters equipped with Intel MIC technology outside of Intel. Shortly thereafter, NICS and Intel joined with Cray to deploy a two-node Cray CX1 “cluster-in-a-box” supercomputer equipped with one KNF coprocessor per node, and with Appro, a computer systems manufacturer, to deploy a four-node conventional cluster with two KNF coprocessors per node. This material is based upon work supported by the National Science Foundation under Grant Number 1137097 and by the University of Tennessee through the Beacon Project. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the University of Tennessee.

Keywords

  • Beacon
  • Best practice
  • Cluster
  • Coprocessor
  • HPC
  • JICS
  • MIC
  • NICS
  • System administration
  • TORQUE
  • Xeon Phi
