To Comprehend the New: On Measuring the Freshness of a Document

Tirthankar Ghosal, Abhishek Shukla, Asif Ekbal, Pushpak Bhattacharyya

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Detecting the novelty or freshness of an entire document is essential in this age of data duplication and semantic-level redundancy all across the web. Current techniques for the problem mostly rely on handcrafted similarity- and divergence-based measures to classify a document as novel or non-novel. However, document-level novelty detection is less explored in the literature than its sentence-level counterpart. In this work, we present a deep neural architecture that automatically predicts the amount of new information contained in a document in the form of a novelty score. We also offer a dataset of more than 7500 documents, annotated at the sentence level, to facilitate further research. Our approach, which learns the notions of novelty and redundancy only from the data, achieves a significant performance improvement over the existing methods and adopted baselines (about 17% error reduction). Our approach also complies with the Two-Stage theory of human recall, which is essential to comprehending new information.

Original language: English
Title of host publication: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728119854
State: Published - Jul 2019
Externally published: Yes
Event: 2019 International Joint Conference on Neural Networks, IJCNN 2019 - Budapest, Hungary
Duration: Jul 14 2019 → Jul 19 2019

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
Volume: 2019-July

Conference

Conference: 2019 International Joint Conference on Neural Networks, IJCNN 2019
Country/Territory: Hungary
City: Budapest
Period: 07/14/19 → 07/19/19

Funding

The first author and Asif Ekbal acknowledge the Visvesvaraya PhD scheme for Electronics and IT and the Visvesvaraya YFRF, respectively, under the Ministry of Electronics and Information Technology (MeitY), Government of India, for support.

Keywords

  • document classification
  • document-level novelty
  • novelty score
