Abstract
Understanding newly emerging events or topics associated with a particular region on a given day can provide deep insight into the critical events occurring in rapidly evolving metropolitan cities. We propose herein a novel topic modeling approach for text documents with spatio-temporal information (e.g., when and where a document was published), such as location-based social media data, to discover prevalent topics or newly emerging events with respect to an area and a time point. We consider a map view composed of regular grids or tiles, each showing topic keywords from the documents of the corresponding region. To this end, we present a tile-based spatio-temporally exclusive topic modeling approach called STExNMF, based on a novel nonnegative matrix factorization (NMF) technique. STExNMF works in two stages: (1) running a standard NMF on each tile to obtain the general topics of the tile and (2) running a spatio-temporally exclusive NMF on a weighted residual matrix. The resulting topics likely reveal information on newly emerging events or topics of interest within a region. We demonstrate the advantages of our approach using the geo-tagged Twitter data of New York City. We also provide quantitative comparisons in terms of topic quality, spatio-temporal exclusiveness, and topic variation, as well as qualitative evaluations of our method using several usage scenarios. In addition, we present a fast topic modeling technique for our model that leverages parallel computing.
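The two stages named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the residual matrix is weighted, so the weighting below (a term's share of the tile's term mass relative to its share in neighboring tiles) is purely an assumption for illustration, as are the function names `nmf` and `exclusive_topics`, the multiplicative-update solver, and the term-by-document matrix layout. It should not be read as the authors' actual formulation.

```python
import numpy as np

def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Rank-k NMF, A ~ W @ H, using Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

def exclusive_topics(A_tile, A_neighbors, k_general=2, k_exclusive=1, eps=1e-9):
    """Two-stage sketch on term-by-document matrices (rows are terms).

    A_tile      : documents of one tile at one time point
    A_neighbors : documents of nearby tiles / time points (same vocabulary)
    """
    # Stage 1: standard NMF gives the general topics of the tile.
    W, H = nmf(A_tile, k_general)

    # Residual: what the general topics fail to explain (clipped at zero).
    R = np.maximum(A_tile - W @ H, 0.0)

    # ASSUMED weighting (not from the source): favor terms whose share of
    # the tile's term mass exceeds their share in the neighborhood.
    tile_share = A_tile.sum(axis=1) / (A_tile.sum() + eps)
    neigh_share = A_neighbors.sum(axis=1) / (A_neighbors.sum() + eps)
    weights = tile_share / (tile_share + neigh_share + eps)  # in [0, 1]

    # Stage 2: NMF on the weighted residual yields the "exclusive" topics.
    W_ex, _ = nmf(weights[:, None] * R, k_exclusive, seed=1)
    return W, W_ex, weights
```

Each column of `W` is a general topic and each column of `W_ex` an exclusive topic; the largest entries in a column index that topic's top keywords. Under the assumed weighting, a term that never appears in the neighboring tiles receives a weight close to 1, so it is more likely to surface in the second stage.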
Original language | English |
---|---|
Title of host publication | Proceedings - 17th IEEE International Conference on Data Mining, ICDM 2017 |
Editors | George Karypis, Srinivas Aluru, Vijay Raghavan, Xindong Wu, Lucio Miele |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 435-444 |
Number of pages | 10 |
ISBN (Electronic) | 9781538638347 |
State | Published - Dec 15 2017 |
Event | 17th IEEE International Conference on Data Mining, ICDM 2017 - New Orleans, United States; Duration: Nov 18 2017 → Nov 21 2017 |
Publication series
Name | Proceedings - IEEE International Conference on Data Mining, ICDM |
---|---|
Volume | 2017-November |
ISSN (Print) | 1550-4786 |
Conference
Conference | 17th IEEE International Conference on Data Mining, ICDM 2017 |
---|---|
Country/Territory | United States |
City | New Orleans |
Period | 11/18/17 → 11/21/17 |
Funding
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. DOE and supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIP) (No. NRF-2016R1C1B2015924). The DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Anomaly detection
- Event detection
- Matrix factorization
- Social network analysis
- Topic modeling