Parallel Hybrid Metaheuristics with Distributed Intensification and Diversification for Large-scale Optimization in Big Data Statistical Analysis

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

7 Scopus citations

Abstract

Important insights into many data science problems that are traditionally analyzed via statistical models can be obtained by reformulating and evaluating them within a large-scale optimization framework. However, the theoretical underpinnings of the statistical model may shift the goal of the decision space traversal from a traditional search for a single optimal solution to a traversal that yields a set of high-quality, independent solutions. We examine statistical frameworks with astronomical decision spaces that translate to optimization problems but are challenging for standard optimization methodologies. We address the new challenges by designing a hybrid metaheuristic with specialized intensification and diversification protocols in the base search algorithm. Our algorithm is extended to the high-performance computing realm using the Stampede2 supercomputer, where we experimentally demonstrate that it can effectively use multiple processors to collaboratively hill climb, broadcast messages to one another regarding landscape characteristics, diversify across the solution landscape, and request aid in climbing particularly difficult peaks.
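
The abstract does not give the algorithm itself, so the following is only a minimal single-process sketch of the general idea it names: hill climbing as intensification, restarts far from known solutions as diversification, and an archive of distinct high-quality solutions instead of one optimum. Every name here (`hill_climb`, `diverse_start`, `search`, the bit-vector encoding, the `min_distance` threshold) is a hypothetical choice made for illustration and is not taken from the paper.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def hill_climb(start, score):
    """Intensification: keep single-bit flips only while they raise the score."""
    current = list(start)
    best = score(current)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            current[i] ^= 1          # try flipping bit i
            s = score(current)
            if s > best:
                best, improved = s, True
            else:
                current[i] ^= 1      # undo a non-improving flip
    return tuple(current), best


def diverse_start(n, archive, rng, tries=20):
    """Diversification: choose the random start farthest from archived solutions."""
    candidates = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(tries)]
    if not archive:
        return candidates[0]
    return max(candidates, key=lambda c: min(hamming(c, a) for a in archive))


def search(score, n, rounds=10, min_distance=2, seed=0):
    """Collect local optima that are at least min_distance apart (assumed rule)."""
    rng = random.Random(seed)
    archive = {}  # solution -> score; a stand-in for messages shared by workers
    for _ in range(rounds):
        start = diverse_start(n, archive, rng)
        solution, value = hill_climb(start, score)
        if all(hamming(solution, a) >= min_distance for a in archive):
            archive[solution] = value
    return archive
```

In a distributed setting, one could imagine each processor running `search` on its own region and exchanging archive entries with the others, but how the paper actually implements its broadcast and aid-request protocols is not stated in the abstract.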

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3312-3320
Number of pages: 9
ISBN (Electronic): 9781728108582
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: Dec 9 2019 to Dec 12 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory: United States
City: Los Angeles
Period: 12/9/19 to 12/12/19

Funding

Yan Y. Liu’s work in this paper is partly supported by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan. Both authors contributed equally to this project. 978-1-7281-0858-2/19/$31.00 ©2019 IEEE. The experiments conducted in this paper used the Extreme Science and Engineering Discovery Environment (XSEDE) resources, which are supported by National Science Foundation grant number ACI-1548562. Specifically, the authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing HPC resources, i.e., the Stampede2 system, that have contributed to the research results reported within this paper.

Keywords

  • Causal Inference
  • Diversification and Intensification
  • Optimization
  • Statistics
