Hierarchical Convolutional Attention Networks for Text Classification

Shang Gao, Arvind Ramanathan, Georgia Tourassi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

51 Scopus citations

Abstract

Recent work in machine translation has demonstrated that self-attention mechanisms can be used in place of recurrent neural networks to increase training speed without sacrificing model accuracy. We propose combining this approach with the benefits of convolutional filters and a hierarchical structure to create a document classification model that is both highly accurate and fast to train; we name our method Hierarchical Convolutional Attention Networks. We demonstrate the effectiveness of this architecture by surpassing the accuracy of the current state-of-the-art on several classification tasks while being twice as fast to train.
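The abstract describes stacking convolutional filters with self-attention in a hierarchy (words → sentences → document). Below is a minimal NumPy sketch of that idea, assuming standard scaled dot-product self-attention over features produced by a 1D convolution; the shapes, filter widths, and pooling choices here are illustrative assumptions, not the authors' exact model.

```python
import numpy as np

def conv1d(x, w):
    """Valid 1D convolution over the time axis.
    x: (seq_len, d_in), w: (k, d_in, d_out) -> (seq_len - k + 1, d_out)."""
    k = w.shape[0]
    return np.stack([np.einsum('td,tdo->o', x[i:i + k], w)
                     for i in range(x.shape[0] - k + 1)])

def self_attention(x):
    """Scaled dot-product self-attention; x serves as queries, keys, values."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # (t, t) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ x                               # attention-weighted sum

rng = np.random.default_rng(0)
words = rng.normal(size=(10, 8))        # 10 toy word embeddings, dim 8
w = rng.normal(size=(3, 8, 8)) * 0.1    # width-3 convolutional filters

# Word level: convolve, attend, then pool to a fixed-size sentence vector.
sent = self_attention(conv1d(words, w)).mean(axis=0)   # shape (8,)

# Sentence level: the same attention applied over sentence vectors
# yields a document vector (two toy sentences here).
sents = np.stack([sent, 0.5 * sent])
doc = self_attention(sents).mean(axis=0)               # shape (8,)
```

The hierarchy keeps each attention computation short (within a sentence, then across sentence vectors), which is one reason such models train faster than recurrent alternatives.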

Original language: English
Title of host publication: ACL 2018 - Representation Learning for NLP, Proceedings of the 3rd Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 11-23
Number of pages: 13
ISBN (Electronic): 9781948087438
State: Published - 2018
Event: 3rd Workshop on Representation Learning for NLP, RepL4NLP 2018 at the 56th Annual Meeting of the Association for Computational Linguistics ACL 2018 - Melbourne, Australia
Duration: Jul 20 2018 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 3rd Workshop on Representation Learning for NLP, RepL4NLP 2018 at the 56th Annual Meeting of the Association for Computational Linguistics ACL 2018
Country/Territory: Australia
City: Melbourne
Period: 07/20/18 → …

Funding

This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health. This work was performed under the auspices of the U.S. Department of Energy by Argonne National Laboratory under Contract DE-AC02-06CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract DE-AC52-06NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
