Analyzing the IT subsystem failure impact on availability of cloud services

  • Guto Leoni Santos
  • Patricia Takako Endo
  • Glauco Goncalves
  • Daniel Rosendo
  • Demis Gomes
  • Judith Kelner
  • Djamel Sadok
  • Mozhgan Mahloo

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

29 Scopus citations

Abstract

Cloud computing has gained popularity in recent years due to its pay-as-you-go business model, high availability of services, and scalability. Service unavailability not only affects user experience but also translates into direct costs for cloud providers and companies. Part of these costs is due to SLA breaches, since interruption times longer than those agreed in the contract incur financial penalties. Thus, cloud providers have tried to identify failure points and estimate the availability of their services. This paper proposes models to assess the availability of services running in a cloud data center infrastructure. The models follow the TIA-942 standard. We propose Tier I and Tier IV models using Reliability Block Diagrams (RBD) to allow modeling of different types of applications, and Stochastic Petri Nets (SPN) to represent the failure behavior of information technology (IT) components in a data center. We perform stationary analysis to measure service availability, and sensitivity analysis to understand which metrics have the greatest impact on data center availability.

Original language: English
Title of host publication: 2017 IEEE Symposium on Computers and Communications, ISCC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 717-723
Number of pages: 7
ISBN (Electronic): 9781538616291
State: Published - Sep 1 2017
Externally published: Yes
Event: 2017 IEEE Symposium on Computers and Communications, ISCC 2017 - Heraklion, Greece
Duration: Jul 3 2017 to Jul 7 2017

Publication series

Name: Proceedings - IEEE Symposium on Computers and Communications
ISSN (Print): 1530-1346

Conference

Conference: 2017 IEEE Symposium on Computers and Communications, ISCC 2017
Country/Territory: Greece
City: Heraklion
Period: 07/3/17 to 07/7/17

Funding

ACKNOWLEDGMENT This work was supported by the RLAM Innovation Center, Ericsson Telecomunicações S.A., Brazil. We would like to thank the design team of the Networking and Telecommunication Research Group (GPRT) for the support.
