Mitigating Catastrophic Forgetting in Deep Learning in a Streaming Setting Using Historical Summary

Sajal Dash, Junqi Yin, Mallikarjun Shankar, Feiyi Wang, Wu Chun Feng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Recent advances in scientific equipment and the adoption of electronics and the Internet of Things (IoT) in everyday life have resulted in large, complex data being produced at a high rate. Extracting meaningful and timely knowledge from this big data at a modest cost is difficult because of limits on computing power and storage. Training deep learning models incrementally in a streaming setting can help overcome these limits. However, in a well-known phenomenon called catastrophic forgetting, incrementally trained models perform increasingly poorly on past data. To mitigate catastrophic forgetting when training in a streaming setting, we propose constructing a historical summary over time and using it together with newly arrived data during incremental training. We propose various data summarization techniques, such as random sampling, micro clustering, coreset computation, and autoencoders, to counteract catastrophic forgetting. We built a pipeline for incrementally training deep learning models on streaming data with a historical summary. We demonstrate the effectiveness of the historical summary in mitigating catastrophic forgetting using three case studies involving three different deep learning applications: an Artificial Neural Network (ANN) for a classification task on the MNIST dataset, a recurrent neural network language model (RNN-LM) on the WikiText2 dataset, and a Convolutional Neural Network (CNN), ResNet50, classifying the ImageNet dataset. While training the models, we observe that catastrophic forgetting is evident in the ANN and the CNN but not in the RNN. For the first task, our method recovers up to 47.9% accuracy lost to catastrophic forgetting. For the third task, the historical summary recovers classification accuracy by up to 25%. For the second task, although there is no evidence of catastrophic forgetting, the training performance (perplexity, PPL) improves by up to 26% with the historical summary.
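As an illustration of the idea described above, the following is a minimal sketch of incremental training with a historical summary. It is not the authors' pipeline: it assumes the summary is a fixed-size uniform random sample maintained by reservoir sampling (one possible form of the "random sampling" technique), and the names `ReservoirSummary`, `incremental_training`, `train_step`, and `capacity` are hypothetical.

```python
import random


class ReservoirSummary:
    """Fixed-size uniform random sample of all items seen so far.

    Assumption: reservoir sampling (Algorithm R) stands in for the
    "random sampling" summarization mentioned in the abstract.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []   # current historical summary
        self.seen = 0     # number of stream items observed so far
        self._rng = random.Random(seed)

    def update(self, batch):
        """Fold a newly arrived batch into the summary."""
        for x in batch:
            self.seen += 1
            if len(self.items) < self.capacity:
                self.items.append(x)
            else:
                # Keep x with probability capacity / seen, replacing
                # a uniformly chosen slot in the summary.
                j = self._rng.randrange(self.seen)
                if j < self.capacity:
                    self.items[j] = x


def incremental_training(stream, train_step, capacity=100, seed=0):
    """Train on each new batch together with the summary of past data.

    `train_step` is a hypothetical callback that performs one round of
    training on the list of examples it receives.
    """
    summary = ReservoirSummary(capacity, seed)
    for batch in stream:
        batch = list(batch)
        # New data plus the historical summary of everything before it.
        train_step(batch + summary.items)
        # Only after training does the batch become part of history.
        summary.update(batch)
    return summary
```

In this sketch the model never revisits the full history; each training round sees only the new batch and at most `capacity` older examples, which keeps storage bounded regardless of stream length.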

Original language: English
Title of host publication: Proceedings of DRBSD-7 2021
Subtitle of host publication: 7th International Workshop on Data Analysis and Reduction for Big Scientific Data, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 11-18
Number of pages: 8
ISBN (Electronic): 9781728186726
DOIs
State: Published - 2021
Event: 7th International Workshop on Data Analysis and Reduction for Big Scientific Data, DRBSD-7 2021 - St. Louis, United States
Duration: Nov 14 2021 → …

Publication series

Name: Proceedings of DRBSD-7 2021: 7th International Workshop on Data Analysis and Reduction for Big Scientific Data, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 7th International Workshop on Data Analysis and Reduction for Big Scientific Data, DRBSD-7 2021
Country/Territory: United States
City: St. Louis
Period: 11/14/21 → …

Funding

Acknowledgements: This research was sponsored by and used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility at the Oak Ridge National Laboratory supported by the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Funders (funder number):
U.S. Department of Energy: DE-AC05-00OR22725
Office of Science

    Keywords

    • Catastrophic Forgetting
    • Deep Learning
    • Incremental Learning
    • Reduction
    • Streaming
    • Summary
