An Accuracy-Maximization Approach for Claims Classifiers in Document Content Analytics for Cybersecurity †

  • Kimia Ameri
  • Michael Hempel
  • Hamid Sharif
  • Juan Lopez
  • Kalyan Perumalla

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This paper presents our research approach and findings on maximizing the accuracy of our feature-claims classifier for cybersecurity literature analytics, and introduces the resulting model, ClaimsBERT. Its architecture, selected after extensive evaluation of different approaches, concatenates a feature map with a Bidirectional Encoder Representations from Transformers (BERT) model. We discuss the deployment of this new concept and the research insights that led to the selection of Convolutional Neural Networks for its feature mapping. We also present results showing that ClaimsBERT outperforms all other evaluated approaches. This new claims classifier is an essential processing stage within our vetting framework, which aims to improve the cybersecurity of industrial control systems (ICS). Furthermore, to maximize the accuracy of the ClaimsBERT classifier, we propose an approach for selecting the optimal architecture and determining optimized hyperparameters, in particular the best learning rate, number of convolutions, filter sizes, activation function, and number of dense layers, as well as the number of neurons and the dropout rate for each layer. Fine-tuning these hyperparameters within our model increased classification accuracy from 76%, obtained with the original BertForSequenceClassification model, to 97% with ClaimsBERT.
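The hyperparameter selection described in the abstract can be illustrated with a minimal sketch of an exhaustive search over a small grid. This is not the authors' procedure, which the abstract does not specify. The hyperparameter names follow the abstract, but every candidate value, the function names `iter_configs` and `select_best`, and the `placeholder_evaluate` scoring function are assumptions made for illustration. The sketch also simplifies the per-layer neuron counts and dropout rates to one shared value each.

```python
from itertools import product

# Hypothetical search space: the keys mirror the hyperparameters named in the
# abstract, but all candidate values are illustrative assumptions.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 5e-5],
    "num_conv_layers": [1, 2],
    "filter_size": [2, 3, 5],
    "activation": ["relu", "tanh"],
    "num_dense_layers": [1, 2],
    "neurons_per_layer": [64, 128],
    "dropout_rate": [0.1, 0.3],
}


def iter_configs(space):
    """Yield every combination of candidate values as a dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def select_best(space, evaluate):
    """Return the (config, score) pair with the highest score.

    `evaluate` is any callable mapping a config dict to a number, for
    example validation accuracy after training a model with that config.
    """
    best_config, best_score = None, float("-inf")
    for config in iter_configs(space):
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def placeholder_evaluate(config):
    """Stand-in for training and validation; NOT a real accuracy value.

    It simply prefers a learning rate near 2e-5 and a dropout rate near 0.1
    so that the search has something to optimize.
    """
    return -abs(config["learning_rate"] - 2e-5) - abs(config["dropout_rate"] - 0.1)


if __name__ == "__main__":
    best, score = select_best(SEARCH_SPACE, placeholder_evaluate)
    print(best)
```

In practice, `placeholder_evaluate` would be replaced by a function that trains the classifier with the given configuration and returns its validation accuracy. An exhaustive grid grows multiplicatively with each added hyperparameter, so a staged or randomized search may be preferable for larger spaces.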

Original language: English
Pages (from-to): 418-443
Number of pages: 26
Journal: Journal of Cybersecurity and Privacy
Volume: 2
Issue number: 2
State: Published - Jun 2022

Funding

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The publisher acknowledges the US government license to provide public access under the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This research has been supported in part by the Department of Energy Cybersecurity for Energy Delivery Systems program and Oak Ridge National Laboratory. This research was funded by the US Department of Energy through a subcontract from Oak Ridge National Laboratory, project No. 4000175929 (project CYVET).

Keywords

  • accuracy maximization
  • BERT
  • classification
  • convolution neural network
  • cybersecurity
  • CYVET
  • natural language processing
  • transfer learning
