TY - JOUR
T1 - Understanding the impact of climate change on critical infrastructure through NLP analysis of scientific literature
AU - Mallick, Tanwi
AU - Bergerson, Joshua David
AU - Verner, Duane R.
AU - Hutchison, John K.
AU - Levy, Leslie Anne
AU - Balaprakash, Prasanna
N1 - Publisher Copyright: © This material is published by permission of the Argonne National Laboratory, operated by UChicago Argonne, LLC for the US Department of Energy under Contract No. DE-AC02-06CH11357. The US Government retains for itself, and others acting on its behalf, a paid-up, non-exclusive, and irrevocable worldwide licence in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.
PY - 2024
Y1 - 2024
AB - Climate change is intensifying natural hazards, putting critical infrastructure systems at risk. The effects of climate change on critical infrastructure can be significant, and communities need to consider these risks when planning and designing infrastructure systems for the future. To that end, natural language processing (NLP) is a promising approach for analyzing large volumes of climate change and infrastructure-related scientific literature. To train a supervised model using NLP techniques, a significant subset of the corpus must be labeled into categories based on user-defined criteria, which is a time-consuming process. To expedite this process, we developed a weak supervision-based approach that leverages semantic similarity between categories and documents to generate category labels for the domain-specific corpus. In comparison with a months-long process of subject-matter expert labeling, we assign category labels to the whole corpus using weak supervision and supervised learning in 13 hours.
KW - BERT embedding
KW - Climate hazards
KW - critical infrastructures
KW - natural language processing
KW - weak supervision
UR - http://www.scopus.com/inward/record.url?scp=85197220821&partnerID=8YFLogxK
U2 - 10.1080/23789689.2024.2355772
DO - 10.1080/23789689.2024.2355772
M3 - Article
AN - SCOPUS:85197220821
SN - 2378-9689
JO - Sustainable and Resilient Infrastructure
JF - Sustainable and Resilient Infrastructure
ER -