Abstract
Generating scientifically grounded hypotheses is a challenging frontier task for generative AI models in science. The difficulty arises from the inherent subjectivity of the task and the extensive knowledge of prior work required to assess the validity of a generated hypothesis. Large Language Models (LLMs), trained on vast datasets from diverse sources, have shown a strong ability to utilize the knowledge embedded in their training data. Recent research has explored using transformer-based models for scientific hypothesis generation. However, these models often require a large number of parameters to manage long sequences, which can be a limitation. State Space Models, such as Mamba, offer an alternative by handling very long sequences effectively with fewer parameters than transformers. In this work, we investigate the use of Mamba for scientific hypothesis generation. Our preliminary findings indicate that Mamba achieves performance comparable to that of transformer-based models of similar size on a higher-order, complex task such as hypothesis generation. Our code is available at https://github.com/fglx-c/ExploringScientific-Hypothesis-Generation-with-Mamba.
| Original language | English |
|---|---|
| Title of host publication | NLP4Science 2024 - 1st Workshop on NLP for Science, Proceedings of the Workshop |
| Editors | Lotem Peled-Cohen, Nitay Calderon, Shir Lissak, Roi Reichart |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 197-207 |
| Number of pages | 11 |
| ISBN (Electronic) | 9798891761858 |
| State | Published - 2024 |
| Event | 1st Workshop on NLP for Science, NLP4Science 2024 - Miami, United States Duration: Nov 16 2024 → … |
Publication series
| Name | NLP4Science 2024 - 1st Workshop on NLP for Science, Proceedings of the Workshop |
|---|---|
Conference
| Conference | 1st Workshop on NLP for Science, NLP4Science 2024 |
|---|---|
| Country/Territory | United States |
| City | Miami |
| Period | 11/16/24 → … |
Funding
This research used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility at the Oak Ridge National Laboratory supported by the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Fingerprint
Dive into the research topics of 'Exploring Scientific Hypothesis Generation with Mamba'. Together they form a unique fingerprint.