TY - GEN
T1 - Exploring Scientific Hypothesis Generation with Mamba
AU - Chai, Miaosen
AU - Herron, Emily
AU - Cervantes, Erick
AU - Ghosal, Tirthankar
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - Generating scientifically grounded hypotheses is a challenging frontier task for generative AI models in science. The difficulty arises from the inherent subjectivity of the task and the extensive knowledge of prior work required to assess the validity of a generated hypothesis. Large Language Models (LLMs), trained on vast datasets from diverse sources, have shown a strong ability to utilize the knowledge embedded in their training data. Recent research has explored using transformer-based models for scientific hypothesis generation, leveraging their advanced capabilities. However, these models often require a large number of parameters to handle long sequences, which can be a limitation. State Space Models, such as Mamba, offer an alternative by handling very long sequences effectively with fewer parameters than transformers. In this work, we investigate the use of Mamba for scientific hypothesis generation. Our preliminary findings indicate that Mamba achieves performance comparable to transformer-based models of similar size on a higher-order complex task like hypothesis generation. We have made our code available here: https://github.com/fglx-c/ExploringScientific-Hypothesis-Generation-with-Mamba.
UR - http://www.scopus.com/inward/record.url?scp=85216933682&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.nlp4science-1.17
DO - 10.18653/v1/2024.nlp4science-1.17
M3 - Conference contribution
AN - SCOPUS:85216933682
T3 - NLP4Science 2024 - 1st Workshop on NLP for Science, Proceedings of the Workshop
SP - 197
EP - 207
BT - NLP4Science 2024 - 1st Workshop on NLP for Science, Proceedings of the Workshop
A2 - Peled-Cohen, Lotem
A2 - Calderon, Nitay
A2 - Lissak, Shir
A2 - Reichart, Roi
PB - Association for Computational Linguistics (ACL)
T2 - 1st Workshop on NLP for Science, NLP4Science 2024
Y2 - 16 November 2024
ER -