PACE: Pattern accurate computationally efficient bootstrapping for timely discovery of cyber-security concepts

Nikki McNeil, Robert A. Bridges, Michael D. Iannacone, Bogdan Czejdo, Nicolas Perez, John R. Goodall

Research output: Contribution to conference › Paper › peer-review

34 Scopus citations

Abstract

Public disclosure of important security information, such as knowledge of vulnerabilities or exploits, often occurs in blogs, tweets, mailing lists, and other online sources significantly before proper classification into structured databases. To facilitate timely discovery of such knowledge, we propose a novel semi-supervised learning algorithm, PACE, for identifying and classifying relevant entities in text sources. The main contribution of this paper is an enhancement of the traditional bootstrapping method for entity extraction: a time-memory trade-off that circumvents a costly corpus search while strengthening pattern nomination, which should increase accuracy. We discuss an implementation in the cyber-security domain, as well as the challenges that this domain poses for Natural Language Processing.

Original language: English
Pages: 60-65
Number of pages: 6
State: Published - 2013
Event: 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013 - Miami, FL, United States
Duration: Dec 4 2013 to Dec 7 2013

Conference

Conference: 2013 12th International Conference on Machine Learning and Applications, ICMLA 2013
Country/Territory: United States
City: Miami, FL
Period: 12/4/13 to 12/7/13

Keywords

  • Bootstrapping
  • Cyber-Security
  • Entity Extraction
  • Natural Language Processing
