High-throughput computing on high-performance platforms: A case study

Danila Oleynik, Sergey Panitkin, Matteo Turilli, Alessio Angius, Sarp Oral, Kaushik De, Alexei Klimentov, Jack C. Wells, Shantenu Jha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan, a DOE leadership facility, in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next-generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.

Original language: English
Title of host publication: Proceedings - 13th IEEE International Conference on eScience, eScience 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 295-304
Number of pages: 10
ISBN (Electronic): 9781538626863
State: Published - Nov 14 2017
Event: 13th IEEE International Conference on eScience, eScience 2017 - Auckland, New Zealand
Duration: Oct 24 2017 to Oct 27 2017

Publication series

Name: Proceedings - 13th IEEE International Conference on eScience, eScience 2017

Conference

Conference: 13th IEEE International Conference on eScience, eScience 2017
Country/Territory: New Zealand
City: Auckland
Period: 10/24/17 to 10/27/17

Keywords

  • high-performance and throughput computing
