TY - GEN
T1 - Multifrontal non-negative matrix factorization
AU - Sao, Piyush
AU - Kannan, Ramakrishnan
N1 - Publisher Copyright:
© Springer Nature Switzerland AG 2020.
PY - 2020
Y1 - 2020
N2 - Non-negative matrix factorization (Nmf) is an important tool in high-performance large scale data analytics with applications ranging from community detection, recommender system, feature detection and linear and non-linear unmixing. While traditional Nmf works well when the data set is relatively dense, however, it may not extract sufficient structure when the data is extremely sparse. Specifically, traditional Nmf fails to exploit the structured sparsity of the large and sparse data sets resulting in dense factors. We propose a new algorithm for performing Nmf on sparse data that we call multifrontal Nmf (Mf-Nmf) since it borrows several ideas from the multifrontal method for unconstrained factorization (e.g. LU and QR). We also present an efficient shared memory parallel implementation of Mf-Nmf and discuss its performance and scalability. We conduct several experiments on synthetic and realworld datasets and demonstrate the usefulness of the algorithm by comparing it against standard baselines. We obtain a speedup of 1.2x to 19.5x on 24 cores with an average speed up of 10.3x across all the real world datasets.
AB - Non-negative matrix factorization (Nmf) is an important tool in high-performance large scale data analytics with applications ranging from community detection, recommender system, feature detection and linear and non-linear unmixing. While traditional Nmf works well when the data set is relatively dense, however, it may not extract sufficient structure when the data is extremely sparse. Specifically, traditional Nmf fails to exploit the structured sparsity of the large and sparse data sets resulting in dense factors. We propose a new algorithm for performing Nmf on sparse data that we call multifrontal Nmf (Mf-Nmf) since it borrows several ideas from the multifrontal method for unconstrained factorization (e.g. LU and QR). We also present an efficient shared memory parallel implementation of Mf-Nmf and discuss its performance and scalability. We conduct several experiments on synthetic and realworld datasets and demonstrate the usefulness of the algorithm by comparing it against standard baselines. We obtain a speedup of 1.2x to 19.5x on 24 cores with an average speed up of 10.3x across all the real world datasets.
KW - Data analysis
KW - Multifrontal methods
KW - Non-negative matrix factorization
KW - Sparse matrix computations
UR - http://www.scopus.com/inward/record.url?scp=85084000767&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-43229-4_46
DO - 10.1007/978-3-030-43229-4_46
M3 - Conference contribution
AN - SCOPUS:85084000767
SN - 9783030432287
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 543
EP - 554
BT - Parallel Processing and Applied Mathematics - 13th International Conference, PPAM 2019, Revised Selected Papers
A2 - Wyrzykowski, Roman
A2 - Karczewski, Konrad
A2 - Deelman, Ewa
A2 - Dongarra, Jack
PB - Springer
T2 - 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019
Y2 - 8 September 2019 through 11 September 2019
ER -