Abstract
The problem of outlier detection is extremely challenging in many domains such as text, in which the attribute values are typically non-negative and most values are zero. In such cases, it often becomes difficult to separate the outliers from the natural variations in the patterns in the underlying data. In this paper, we present a matrix factorization method that is naturally able to distinguish the anomalies by using low-rank approximations of the underlying data. Our iterative algorithm, TONMF, is based on the Block Coordinate Descent (BCD) framework. Our approach has significant advantages over traditional methods for text outlier detection. Finally, we present experimental results illustrating the effectiveness of our method over competing methods.
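The idea of separating anomalies from a low-rank approximation can be illustrated with a minimal sketch. This is not the TONMF algorithm from the paper: it assumes a plain truncated SVD in place of the non-negative factorization and BCD updates, and it scores each document by the norm of its residual column. The function name `low_rank_outlier_scores` and the toy term-document matrix are hypothetical, chosen only for illustration.

```python
import numpy as np

def low_rank_outlier_scores(A, rank):
    """Score each column (document) of A by the norm of its residual
    after a rank-`rank` truncated-SVD approximation.

    Assumption: truncated SVD stands in for the paper's non-negative
    factorization; this is an illustration, not TONMF.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the top `rank` singular triplets.
    A_low = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    residual = A - A_low
    # One score per column: large residual means poorly explained.
    return np.linalg.norm(residual, axis=0)

# Toy term-document matrix (rows: terms, columns: documents).
# Documents 0-2 share terms 0-1, documents 3-5 share terms 2-3,
# and document 6 uses terms 4-5, which no other document uses.
A = np.array([
    [2, 1, 3, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0, 0],
    [0, 0, 0, 3, 1, 2, 0],
    [0, 0, 0, 1, 2, 2, 0],
    [0, 0, 0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0, 0, 2],
], dtype=float)

scores = low_rank_outlier_scores(A, rank=2)
outlier = int(np.argmax(scores))
print(outlier, np.round(scores, 3))
```

With rank 2, the two dominant topics absorb most of documents 0 to 5, while document 6 is left entirely in the residual and receives the highest score. A method designed for text would additionally keep the factors non-negative and exploit sparsity, which this sketch does not attempt.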
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017 |
| Editors | Nitesh Chawla, Wei Wang |
| Publisher | Society for Industrial and Applied Mathematics Publications |
| Pages | 489-497 |
| Number of pages | 9 |
| ISBN (Electronic) | 9781611974874 |
| State | Published - 2017 |
| Event | 17th SIAM International Conference on Data Mining, SDM 2017 - Houston, United States. Duration: Apr 27 2017 → Apr 29 2017 |
Publication series
| Name | Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017 |
|---|---|
Conference
| Conference | 17th SIAM International Conference on Data Mining, SDM 2017 |
|---|---|
| Country/Territory | United States |
| City | Houston |
| Period | 04/27/17 → 04/29/17 |
Funding
This manuscript has been co-authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. This project was partially funded by the Laboratory Director’s Research and Development fund, National Science Foundation (NSF) grant IIS-1348152, and Defense Advanced Research Projects Agency (DARPA) XDATA program grant FA8750-12-2-0309. It was also sponsored by the Army Research Laboratory (ARL) and accomplished under Cooperative Agreement Number W911NF-09-2-0053. In addition, H. Woo is supported by NRF-2015R101A1A01061261.