Abstract
Science communication, in layperson's terms, is essential to reach the general population and also maximize the impact of underlying scientific research. Hence, good science blogs and journalistic reviews of research articles are so well-read and critical to conveying science. Scientific blogging goes beyond traditional research summaries, offering experts a platform to articulate findings in layperson's terms. It bridges the gap between intricate research and its comprehension by the general public, policymakers, and other researchers. Amid the rapid expansion of scientific data and the accelerating pace of research, credible science blogs serve as vital artifacts for evidence-based information to the general non-expert audience. However, writing a scientific blog or even a short lay summary requires significant time and effort. Here, we are intrigued by the question: What if the process of writing a scientific blog based on a given paper could be semi-automated to produce the first draft? In this paper, we introduce a novel task of Artificial Intelligence (AI)-based science blog generation from a research article. We leverage the idea that presentations and science blogs share a symbiotic relationship in their aim to clarify and elucidate complex scientific concepts. Both rely on visuals, such as figures, to aid comprehension. With this motivation, we create a new dataset of science blogs using presentation transcript and its corresponding slides. We create a dataset containing a paper's presentation transcript and figures annotated from nearly 3000 papers. We then propose a multimodal attention model to generate a blog text and select the most relevant figures to explain a research article in layperson's terms, essentially a science blog. Our experimental results with respect to both automatic and human evaluation metrics show the effectiveness of our proposed approach and the usefulness of our proposed dataset.
Original language | English |
---|---|
Title of host publication | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings |
Editors | Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue |
Publisher | European Language Resources Association (ELRA) |
Pages | 10790-10801 |
Number of pages | 12 |
ISBN (Electronic) | 9782493814104 |
State | Published - 2024 |
Event | Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy Duration: May 20 2024 → May 25 2024 |
Publication series
Name | 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings |
---|
Conference
Conference | Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 |
---|---|
Country/Territory | Italy |
City | Hybrid, Torino |
Period | 05/20/24 → 05/25/24 |
Funding
Sandeep Kumar acknowledges the Prime Minister Research Fellowship (PMRF) program of the Govt of India for its support.
Keywords
- Blog Generation
- Image Retrieval
- Multimodal
- Multioutput
- Research Article Summarization