Abstract
Non-negative matrix factorization (Nmf) is an important tool in high-performance large scale data analytics with applications ranging from community detection, recommender system, feature detection and linear and non-linear unmixing. While traditional Nmf works well when the data set is relatively dense, however, it may not extract sufficient structure when the data is extremely sparse. Specifically, traditional Nmf fails to exploit the structured sparsity of the large and sparse data sets resulting in dense factors. We propose a new algorithm for performing Nmf on sparse data that we call multifrontal Nmf (Mf-Nmf) since it borrows several ideas from the multifrontal method for unconstrained factorization (e.g. LU and QR). We also present an efficient shared memory parallel implementation of Mf-Nmf and discuss its performance and scalability. We conduct several experiments on synthetic and realworld datasets and demonstrate the usefulness of the algorithm by comparing it against standard baselines. We obtain a speedup of 1.2x to 19.5x on 24 cores with an average speed up of 10.3x across all the real world datasets.
Original language | English |
---|---|
Title of host publication | Parallel Processing and Applied Mathematics - 13th International Conference, PPAM 2019, Revised Selected Papers |
Editors | Roman Wyrzykowski, Konrad Karczewski, Ewa Deelman, Jack Dongarra |
Publisher | Springer |
Pages | 543-554 |
Number of pages | 12 |
ISBN (Print) | 9783030432287 |
DOIs | |
State | Published - 2020 |
Event | 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019 - Bialystok, Poland Duration: Sep 8 2019 → Sep 11 2019 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12043 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019 |
---|---|
Country/Territory | Poland |
City | Bialystok |
Period | 09/8/19 → 09/11/19 |
Funding
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. This research was funded by Oak Ridge National Laboratory-Laboratory Directed Research and Development (LDRD) and used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/ downloads/doe-public-access-plan). This manuscript has been authored by UT-Battelle, LLC under Contract No. DEAC05-00OR22725 with the U.S. Department of Energy. This research was funded by Oak Ridge National Laboratory-Laboratory Directed Research and Development (LDRD) and used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/ downloads/doe-public-access-plan).
Keywords
- Data analysis
- Multifrontal methods
- Non-negative matrix factorization
- Sparse matrix computations