Performance implications from sizing a VM on multi-core systems: A data analytic application's view

Seung Hwan Lim, James Horey, Yanjun Yao, Edmon Begoli, Qing Cao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

In this paper, we present a quantitative performance analysis of data analytics applications running on multi-core virtual machines. Such environments form the core of cloud computing. In addition, data analytics applications, such as Cassandra and Hadoop, are becoming increasingly popular on cloud computing platforms. This convergence necessitates a better understanding of the performance and cost implications of such hybrid systems. For example, the very first step in hosting applications in virtualized environments requires the user to configure the number of virtual processors and the size of memory. To understand the performance implications of this step, we benchmarked three Yahoo Cloud Serving Benchmark (YCSB) workloads in a virtualized multi-core environment. Our measurements indicate that the performance of Cassandra for YCSB workloads does not heavily depend on the processing capacity of a system, while the size of the data set relative to allocated memory is critical to performance. We also identified a strong relationship between the running time of workloads and various hardware events (last-level cache loads, misses, and CPU migrations). From this analysis, we provide several suggestions to improve the performance of data analytics applications running in cloud computing environments.

Original language: English
Title of host publication: Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
Publisher: IEEE Computer Society
Pages: 1001-1008
Number of pages: 8
ISBN (Print): 9780769549798
State: Published - 2013
Event: 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013 - Boston, MA, Japan
Duration: Jul 22 2013 → Jul 26 2013

Publication series

Name: Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013

Conference

Conference: 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013
Country/Territory: Japan
City: Boston, MA
Period: 07/22/13 → 07/26/13
