Abstract
Most current clustering-based anomaly detection methods use scoring schemes and thresholds to classify anomalies. These methods are often tailored to specific data sets with a “known” number of clusters. This paper presents a streaming clustering and anomaly detection algorithm that requires neither strict, arbitrary thresholds on the anomaly scores nor knowledge of the number of clusters, and that performs probabilistic anomaly detection and clustering simultaneously. As a result, cluster formation is not affected by the presence of anomalous data, which leads to a more reliable definition of “normal vs. abnormal” behavior. The motivations behind developing the INCAD model [17] and the path that leads to the streaming model are also discussed.
Original language | English |
---|---|
Title of host publication | Computational Science – ICCS 2019 - 19th International Conference, Proceedings |
Editors | João M.F. Rodrigues, Pedro J.S. Cardoso, Jânio Monteiro, Roberto Lam, Valeria V. Krzhizhanovskaya, Michael H. Lees, Peter M.A. Sloot, Jack J. Dongarra |
Publisher | Springer Verlag |
Pages | 45-59 |
Number of pages | 15 |
ISBN (Print) | 9783030227463 |
State | Published - 2019 |
Externally published | Yes |
Event | 19th International Conference on Computational Science, ICCS 2019 - Faro, Portugal. Duration: Jun 12 2019 → Jun 14 2019 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11539 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 19th International Conference on Computational Science, ICCS 2019 |
---|---|
Country/Territory | Portugal |
City | Faro |
Period | 06/12/19 → 06/14/19 |
Funding
Acknowledgements. The authors would like to acknowledge University at Buffalo Center for Computational Research (http://www.buffalo.edu/ccr.html) for its computing resources that were made available for conducting the research reported in this paper. Financial support of the National Science Foundation Grant numbers NSF/OAC 1339765 and NSF/DMS 1621853 is acknowledged.
Keywords
- Anomaly detection
- Bayesian non-parametric models
- Clustering based anomaly detection
- Extreme value theory