Abstract
Latent Semantic Analysis (LSA) can be used to reduce the dimensions of large term-document datasets using Singular Value Decomposition (SVD). However, with the ever-expanding size of datasets, current implementations are not fast enough to compute results quickly and easily on a standard PC. The Graphics Processing Unit (GPU) can solve some highly parallel problems much faster than the traditional sequential processor (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a computer cluster. In this paper, we present a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture (CUDA) and Compute Unified Basic Linear Algebra Subprograms (CUBLAS). The performance of this implementation is compared to a traditional LSA implementation on the CPU using an optimized Basic Linear Algebra Subprograms (BLAS) library. For large matrices with dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version.
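The core operation the abstract describes — reducing a term-document matrix via a truncated SVD — can be illustrated with a minimal NumPy sketch. This is not the paper's CUDA/CUBLAS implementation; the toy matrix and the rank `k` are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy term-document matrix: rows = terms, columns = documents.
A = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
])

# Full SVD: A = U @ diag(s) @ Vt, with singular values s in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# LSA keeps only the k largest singular values, yielding the best
# rank-k approximation of A (Eckart-Young theorem).
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Document representations in the reduced k-dimensional latent space:
# one k-vector per document (column).
docs_k = np.diag(s[:k]) @ Vt[:k, :]
```

On the GPU, the dominant dense linear-algebra steps of such a computation are what CUBLAS accelerates; the 5-6x speedup reported above is for matrices whose dimensions are divisible by 16, which aligns with CUDA's preferred memory-access tile sizes.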
Original language | English |
---|---|
Title of host publication | Proceedings of the 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009 |
Publisher | Association for Computing Machinery |
Pages | 2205-2209 |
Number of pages | 5 |
ISBN (Print) | 9781605583259 |
DOIs | |
State | Published - 2009 |
Event | 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009 - Montreal, QC, Canada Duration: Jul 8 2009 → Jul 12 2009 |
Publication series
Name | Proceedings of the 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009 |
---|---|
Volume | 2009-January |
Conference
Conference | 11th Annual Genetic and Evolutionary Computation Conference, GECCO-2009 |
---|---|
Country/Territory | Canada |
City | Montreal, QC |
Period | 07/8/09 → 07/12/09 |
Funding
This research was done at Oak Ridge National Laboratory as part of the Department of Energy's Student Undergraduate Laboratory Internship program. Oak Ridge National Laboratory is managed by UT-Battelle LLC for the US Department of Energy under contract number DE-AC05-00OR22725. This work was supported in part by the Department of Energy's Student Undergraduate Laboratory Internship program, the Office of Naval Research (N0001408IP20066), and the Oak Ridge National Laboratory Seed Money fund (3210-2276). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Oak Ridge National Laboratory, the Office of Naval Research, the Department of Energy, or the U.S. government.
Keywords
- gpu
- latent semantic indexing
- text mining