Partitioning and communication strategies for sparse non-negative matrix factorization

Oguz Kaya, Ramakrishnan Kannan, Grey Ballard

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

8 Scopus citations

Abstract

Non-negative matrix factorization (NMF), the problem of finding two non-negative low-rank factors whose product approximates an input matrix, is a useful tool for many data mining and scientific applications such as topic modeling in text mining and unmixing in microscopy. In this paper, we focus on scaling algorithms for NMF to very large sparse datasets and massively parallel machines by employing effective algorithms, communication patterns, and partitioning schemes that leverage the sparsity of the input matrix. We consider two previous works developed for related problems, one that uses a fine-grained partitioning strategy using a point-to-point communication pattern and one that uses a Cartesian, or checkerboard, partitioning strategy using a collective-based communication pattern. We show that a combination of the previous approaches balances the demands of the various computations within NMF algorithms and achieves high efficiency and scalability. From the experiments, we see that our proposed strategy runs up to 10x faster than the state of the art on real-world datasets.

Original languageEnglish
Title of host publicationProceedings of the 47th International Conference on Parallel Processing, ICPP 2018
PublisherAssociation for Computing Machinery
ISBN (Print)9781450365109
DOIs
StatePublished - Aug 13 2018
Event47th International Conference on Parallel Processing, ICPP 2018 - Eugene, United States
Duration: Aug 13 2018Aug 16 2018

Publication series

NameACM International Conference Proceeding Series

Conference

Conference47th International Conference on Parallel Processing, ICPP 2018
Country/TerritoryUnited States
CityEugene
Period08/13/1808/16/18

Keywords

  • Distributed-memory parallel computing
  • Hypergraph partitioning
  • Sparse/dense matrix multiplication (SpMM)

Fingerprint

Dive into the research topics of 'Partitioning and communication strategies for sparse non-negative matrix factorization'. Together they form a unique fingerprint.

Cite this