Team Innovators at SemEval-2022 for Task 8: Multi-Task Training with Hyperpartisan and Semantic Relation for Multi-Lingual News Article Similarity

Nidhir Bhavsar, Rishikesh Devanathan, Aakash Bhatnagar, Muskaan Singh, Petr Motlicek, Tirthankar Ghosal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

This work presents the system proposed by team Innovators for SemEval 2022 Task 8: Multilingual News Article Similarity (Chen et al., 2022). Similar multilingual news articles should match irrespective of the style of writing, the language of conveyance, and the subjective decisions and biases induced by the medium or outlet. The proposed architecture includes a machine translation system that translates multilingual news articles into English, followed by a multitask learning model trained simultaneously on three distinct datasets. The system leverages the PageRank algorithm for long-form text alignment. The multitask learning approach allows simultaneous training of multiple tasks while sharing the same encoder during training, facilitating knowledge transfer between tasks. Our best model is ranked 16th with a Pearson score of 0.733. We make our code accessible.
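The abstract does not say how PageRank is applied to long articles. As a rough illustration only, the following sketch ranks sentences with PageRank over a sentence-similarity graph (in the style of TextRank), so that a long article could be shortened to its most central sentences before comparison. Everything here is an assumption made for illustration and not the authors' implementation: the function names `jaccard`, `pagerank_sentences`, and `top_sentences`, the word-overlap similarity, the damping factor of 0.85, and the fixed iteration count.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two sentences (illustrative choice)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pagerank_sentences(sentences: List[str],
                       damping: float = 0.85,
                       iterations: int = 50) -> List[float]:
    """Score sentences by PageRank on a weighted similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Edge weight between sentences i and j; no self-loops.
    weights = [[jaccard(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            rank = 0.0
            for i in range(n):
                if out_sums[i] > 0:
                    rank += scores[i] * weights[i][j] / out_sums[i]
                else:
                    # Sentence with no similar neighbours: spread uniformly.
                    rank += scores[i] / n
            new_scores.append((1.0 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def top_sentences(sentences: List[str], k: int = 2) -> List[str]:
    """Return the k highest-scoring sentences in their original order."""
    scores = pagerank_sentences(sentences)
    best = sorted(range(len(sentences)),
                  key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    article = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "a cat sat on a mat",
        "quantum flux inverter",
    ]
    print(top_sentences(article, k=2))
```

In a sketch like this, a sentence that shares no words with the rest of the article receives the lowest score, so it is the first to be dropped when the text is shortened.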

Original language: English
Title of host publication: SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop
Editors: Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Publisher: Association for Computational Linguistics (ACL)
Pages: 1163-1170
Number of pages: 8
ISBN (Electronic): 9781955917803
State: Published - 2022
Externally published: Yes
Event: 16th International Workshop on Semantic Evaluation, SemEval 2022 - Seattle, United States
Duration: Jul 14 2022 to Jul 15 2022

Publication series

Name: SemEval 2022 - 16th International Workshop on Semantic Evaluation, Proceedings of the Workshop

Conference

Conference: 16th International Workshop on Semantic Evaluation, SemEval 2022
Country/Territory: United States
City: Seattle
Period: 07/14/22 to 07/15/22

Funding

This work was supported by the European Union's Horizon 2020 research and innovation program under grant agreement No. 833635 (project ROXANNE: Real-time network, text, and speaker analytics for combating organized crime, 2019-2022).
