A Scalable Graph Analytics Framework for Programming with Big Data in R (pbdR)

S. M. Shamimul Hasan, Drew Schmidt, Ramakrishnan Kannan, Neena Imam

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Many disciplines, such as biology, economics, engineering, physics, and the social sciences, represent their data as graphs to capture patterns, trends, and associations. There are many commercially available graph libraries in different programming languages to analyze these complex graphs. But there is no distributed graph library package in R, the popular statistical programming language, to analyze graphs that are bigger than a single machine's memory. Many domain experts prefer R over the numerous other alternatives. To address this gap, we present a distributed graph analytics framework for R called programming with big graph using R (pBGR). Our proposed framework leverages the Programming with Big Data in R (pbdR) ecosystem, which provides scalable R packages for distributed computing in data science. We present an early prototype implementation of this framework using the distributed-memory parallel graph library CombBLAS and evaluate the framework's performance on leadership-class computing platforms. Our experimental results demonstrate that the proposed framework is capable of performing large-scale parallel graph mining through the easy-to-use R language. This enhanced graph processing capability, coupled with other statistical tools already available in R, should be valuable to many domain experts.

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4783-4792
Number of pages: 10
ISBN (Electronic): 9781728108582
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: Dec 9 2019 to Dec 12 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory: United States
City: Los Angeles
Period: 12/9/19 to 12/12/19

Funding

Support for this work was provided by the United States Department of Defense. We used resources of the Computational Research and Development Programs and the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders
U.S. Department of Defense
U.S. Department of Energy
Office of Science

    Keywords

    • CombBLAS
    • R
    • Titan
    • pBGR
    • pbdR
