TY - GEN
T1 - Outlier detection for text data
AU - Kannan, Ramakrishnan
AU - Woo, Hyenkyun
AU - Aggarwal, Charu C.
AU - Park, Haesun
N1 - Publisher Copyright: Copyright © by SIAM.
PY - 2017
Y1 - 2017
AB - The problem of outlier detection is extremely challenging in many domains such as text, in which the attribute values are typically non-negative, and most values are zero. In such cases, it often becomes difficult to separate the outliers from the natural variations in the patterns in the underlying data. In this paper, we present a matrix factorization method, which is naturally able to distinguish the anomalies with the use of low-rank approximations of the underlying data. Our iterative algorithm TONMF is based on the Block Coordinate Descent (BCD) framework. Our approach has significant advantages over traditional methods for text outlier detection. Finally, we present experimental results illustrating the effectiveness of our method over competing methods.
UR - http://www.scopus.com/inward/record.url?scp=85027865472&partnerID=8YFLogxK
U2 - 10.1137/1.9781611974973.55
DO - 10.1137/1.9781611974973.55
M3 - Conference contribution
AN - SCOPUS:85027865472
T3 - Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
SP - 489
EP - 497
BT - Proceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
A2 - Chawla, Nitesh
A2 - Wang, Wei
PB - Society for Industrial and Applied Mathematics Publications
T2 - 17th SIAM International Conference on Data Mining, SDM 2017
Y2 - 27 April 2017 through 29 April 2017
ER -