Analyzing and Explaining Black-Box Models for Online Malware Detection

Research output: Contribution to journal › Article › peer-review

30 Scopus citations

Abstract

In recent years, a significant amount of research has focused on analyzing the effectiveness of machine learning (ML) models for malware detection. These approaches have ranged from methods such as decision trees and clustering to more complex approaches like support vector machines (SVMs) and deep neural networks. In particular, neural networks have proven to be very effective in detecting complex and advanced malware. This, however, comes with a caveat: neural networks are notoriously complex, so their decisions are often accepted without questioning why the model made that specific decision. The black-box characteristic of neural networks has challenged researchers to explore methods to explain black-box models such as SVMs and neural networks and their decision-making process. Transparency and explainability give experts and malware analysts assurance about, and trust in, the ML models' decisions. In addition, they help in generating comprehensive reports that can be used to enhance cyber threat intelligence sharing. This much-needed analysis drives our work in this paper to explore the explainability and interpretability of ML models in the field of online malware detection. We use the Shapley Additive exPlanations (SHAP) explainability technique to interpret the outcomes of different ML models, namely SVM-Linear, SVM-RBF (Radial Basis Function), Random Forest (RF), Feed-Forward Neural Network (FFNN), and Convolutional Neural Network (CNN) models, trained on an online malware dataset. To explain the output of these models, the explainability techniques KernelSHAP, TreeSHAP, and DeepSHAP are applied to the obtained results.
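The abstract's SHAP variants (KernelSHAP, TreeSHAP, DeepSHAP) all estimate the same quantity: a feature's Shapley value, i.e. its average marginal contribution to the model's output over all feature orderings, relative to a background (reference) input. The sketch below is not the authors' pipeline; it computes exact Shapley values by brute force for a hypothetical three-feature scoring function, which is feasible only for small feature counts and is what KernelSHAP approximates by sampling.

```python
from itertools import permutations

# Toy scoring function standing in for a trained malware classifier's
# output (hypothetical; not a model from the paper).
def model(x):
    return 2.0 * x[0] + 1.0 * x[1] * x[2]

def shapley_values(f, x, background):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings. 'Absent' features are held at a background
    (reference) value, as SHAP does."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        z = list(background)          # start from the reference input
        prev = f(z)
        for i in order:
            z[i] = x[i]               # reveal feature i
            cur = f(z)
            phi[i] += cur - prev      # marginal contribution of i
            prev = cur
    n_orderings = 1
    for k in range(2, n + 1):
        n_orderings *= k              # n! orderings
    return [p / n_orderings for p in phi]

x = [1.0, 2.0, 3.0]
bg = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, bg)
# SHAP's additivity property: baseline output + sum of attributions
# equals the model's output on x.
print(phi, model(bg) + sum(phi), model(x))
```

Here `x[0]` gets the full weight of its independent linear term, while the `x[1] * x[2]` interaction is split equally between the two features, illustrating how Shapley values distribute credit for interactions.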

Original language: English
Pages (from-to): 25237–25252
Number of pages: 16
Journal: IEEE Access
Volume: 11
DOIs
State: Published - 2023

Funding

This work was supported in part by the National Science Foundation at North Carolina Agricultural and Technical State University under Grant 2150297, and in part by Tennessee Tech University under Grant 2025682 and Grant 2043324.

Keywords

  • explainability
  • Explainable AI
  • explainable malware analysis
  • feature contribution
  • interpretability
  • online malware detection
  • SHAP
