TY - GEN
T1 - HedgePeer
T2 - 22nd ACM/IEEE Joint Conference on Digital Libraries, JCDL 2022
AU - Ghosal, Tirthankar
AU - Varanasi, Kamal Kaushik
AU - Kordoni, Valia
N1 - Publisher Copyright: © 2022 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
PY - 2022/6/20
Y1 - 2022/6/20
N2 - Uncertainty detection from text is essential in many applications in information retrieval (IR). Detecting textual uncertainties helps extract factual information instead of uncertain or non-factual information. To avoid overprecise commitment, people use linguistic devices like hedges (uncertain words or phrases). In peer reviews, reviewers often use hedges wherever they are unsure about their opinion or when facts do not back their opinions. Usage of hedges or uncertain words in writing can also indicate the reviewer's confidence or measure of conviction in their reviews. Reviewer confidence is important in the peer review process (especially to the editors or chairs) to judge the quality of evaluation of the paper under review. However, the self-annotated reviewer confidence score is often miscalibrated or biased and not an accurate representation of the reviewer's conviction of their judgment on the merit of the paper. Less confident reviewers sometimes speculate their observations. Here in this paper, we introduce HedgePeer, a new uncertainty detection dataset of peer review comments, which is more than five times larger than the existing datasets on hedge detection in other domains. We curate our dataset from the open-access reviews available in the open review platform and annotate the review comments in terms of the hedge cues and hedge spans. We also provide several baseline approaches, including a multitask learning model with sentiment intensity and parts-of-speech as scaffold tasks to predict hedge cues and spans. We make our dataset and baseline codes available at https://github.com/Tirthankar-Ghosal/HedgePeer-Dataset. Our dataset is motivated towards computationally estimating the reviewer's conviction from their review texts.
AB - Uncertainty detection from text is essential in many applications in information retrieval (IR). Detecting textual uncertainties helps extract factual information instead of uncertain or non-factual information. To avoid overprecise commitment, people use linguistic devices like hedges (uncertain words or phrases). In peer reviews, reviewers often use hedges wherever they are unsure about their opinion or when facts do not back their opinions. Usage of hedges or uncertain words in writing can also indicate the reviewer's confidence or measure of conviction in their reviews. Reviewer confidence is important in the peer review process (especially to the editors or chairs) to judge the quality of evaluation of the paper under review. However, the self-annotated reviewer confidence score is often miscalibrated or biased and not an accurate representation of the reviewer's conviction of their judgment on the merit of the paper. Less confident reviewers sometimes speculate their observations. Here in this paper, we introduce HedgePeer, a new uncertainty detection dataset of peer review comments, which is more than five times larger than the existing datasets on hedge detection in other domains. We curate our dataset from the open-access reviews available in the open review platform and annotate the review comments in terms of the hedge cues and hedge spans. We also provide several baseline approaches, including a multitask learning model with sentiment intensity and parts-of-speech as scaffold tasks to predict hedge cues and spans. We make our dataset and baseline codes available at https://github.com/Tirthankar-Ghosal/HedgePeer-Dataset. Our dataset is motivated towards computationally estimating the reviewer's conviction from their review texts.
KW - Hedges
KW - Peer Reviews
KW - Reviewer Confidence
KW - Uncertainty Detection
UR - http://www.scopus.com/inward/record.url?scp=85133224884&partnerID=8YFLogxK
U2 - 10.1145/3529372.3533300
DO - 10.1145/3529372.3533300
M3 - Conference contribution
AN - SCOPUS:85133224884
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
BT - JCDL 2022 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 June 2022 through 24 June 2022
ER -