The GPU enhanced parallel computing for large scale data clustering

Xiaohui Cui, Jesse St. Charles, Thomas E. Potok

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

Analyzing and clustering large-scale data sets is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. One limitation of this method of data clustering is its O(n²) complexity. As the number of data points and feature dimensions grows, it becomes increasingly difficult to generate results in a reasonable amount of time. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. In this chapter, we exploit this architecture and apply its strengths to the flocking-based data clustering problem. Using the CUDA platform from NVIDIA, we developed a Multiple Species Data Flocking implementation to run on NVIDIA GPUs. The GPU implementation achieved a 30- to 60-fold performance improvement over the CPU implementation.
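
To make the O(n²) cost mentioned in the abstract concrete, the following is a minimal sketch of one all-pairs flocking update. It is not the authors' implementation: the function names (`flock_step`, `count_pair_checks`), the similarity measure, and the parameters `radius`, `sim_threshold`, and `dt` are all assumptions chosen for illustration. Each data point is treated as an agent with a 2D position and a feature vector; nearby agents with similar features attract each other and dissimilar ones repel, so similar items would be expected to drift into groups over many steps.

```python
import math


def flock_step(pos, vel, feat, radius=2.0, sim_threshold=0.5, dt=0.1):
    """One all-pairs update (illustrative only, not the paper's algorithm).

    pos, vel: lists of (x, y) tuples; feat: list of feature tuples.
    The nested loops over i and j are the source of the O(n^2) cost.
    """
    n = len(pos)
    new_vel = []
    for i in range(n):
        ax = ay = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dist = math.hypot(dx, dy)
            if dist == 0.0 or dist > radius:
                continue  # coincident or outside the neighborhood
            # Assumed similarity: 1 / (1 + Euclidean feature distance).
            sim = 1.0 / (1.0 + math.dist(feat[i], feat[j]))
            # Similar neighbors attract, dissimilar neighbors repel.
            sign = 1.0 if sim >= sim_threshold else -1.0
            ax += sign * dx / dist
            ay += sign * dy / dist
        new_vel.append((vel[i][0] + dt * ax, vel[i][1] + dt * ay))
    new_pos = [(p[0] + dt * v[0], p[1] + dt * v[1])
               for p, v in zip(pos, new_vel)]
    return new_pos, new_vel


def count_pair_checks(n):
    """Number of (i, j) pairs with i != j examined in one step."""
    return n * (n - 1)
```

In this sketch every iteration of the outer loop reads the old positions and writes only its own new velocity, so the iterations are independent of one another. That independence is what makes this kind of update a plausible candidate for a one-thread-per-agent mapping on a GPU, although the abstract does not say how the authors organized their CUDA kernels.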

Original language: English
Title of host publication: Proceedings - 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2011
Pages: 220-225
Number of pages: 6
State: Published - 2011
Event: 3rd International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2011 - Beijing, China
Duration: Oct 10 2011 to Oct 12 2011

Publication series

Name: Proceedings - 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2011

Conference

Conference: 3rd International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2011
Country/Territory: China
City: Beijing
Period: 10/10/11 to 10/12/11

Keywords

  • GPU
  • clustering
  • flocking
  • large scale
