Abstract
Bibliometric analysis is essential for understanding research trends, scope, and impact in urban science, especially in high-impact journals such as those in the Nature Portfolio. However, traditional methods, which rely on keyword searches and basic NLP techniques, often fail to uncover valuable insights that are not explicitly stated in article titles or keywords. These approaches cannot perform semantic search or capture context, which limits their effectiveness in classifying topics and characterizing studies. In this paper, we address these limitations by leveraging generative AI models, specifically transformers and Retrieval-Augmented Generation (RAG), to automate and enhance bibliometric analysis. We developed a technical workflow that integrates a vector database, Sentence Transformers, a Gaussian Mixture Model (GMM), a retrieval agent, and Large Language Models (LLMs) to enable contextual search, topic ranking, and characterization of research using customized prompt templates. A pilot study analyzing 223 urban science-related articles published in Nature Communications over the past decade highlights the effectiveness of our approach in generating insightful summary statistics on the quality, scope, and characteristics of papers in high-impact journals. This study introduces a new paradigm for enhancing bibliometric analysis and knowledge retrieval in urban research, positioning an AI agent as a powerful tool for advancing research evaluation and understanding.
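The retrieval and prompt-template steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the article titles, the three-dimensional toy vectors, the `retrieve` and `build_prompt` helpers, and the template wording are all hypothetical. A real pipeline would presumably obtain embeddings from a sentence-embedding model, store them in a vector database, and pass the finished prompt to an LLM. The GMM clustering step and the LLM call are omitted here.

```python
import math

# Hypothetical toy "embeddings" (assumption): stand-ins for vectors that a
# sentence-embedding model would produce and a vector database would store.
ARTICLES = {
    "Urban heat islands and mortality": [0.9, 0.1, 0.0],
    "Street network structure and mobility": [0.1, 0.9, 0.1],
    "Building energy demand under warming": [0.7, 0.2, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, k=2):
    """Return the k article titles most similar to the query vector."""
    ranked = sorted(
        ARTICLES,
        key=lambda title: cosine(query_vec, ARTICLES[title]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical prompt template (assumption): the wording is illustrative only.
PROMPT_TEMPLATE = (
    "You are assisting with a bibliometric review.\n"
    "Context articles:\n{context}\n"
    "Question: {question}\n"
    "Answer using only the context above."
)


def build_prompt(question, query_vec, k=2):
    """Fill the template with the retrieved titles and the user question."""
    context = "\n".join(f"- {title}" for title in retrieve(query_vec, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_prompt("Which studies address urban climate risk?", [1.0, 0.0, 0.1])
print(prompt)
```

Ranking by embedding similarity rather than by keyword overlap is what allows a query to match articles whose titles do not contain the query terms, which is the limitation of keyword search that the abstract describes.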
| Original language | English |
|---|---|
| Title of host publication | Urban-AI 2024 - Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI |
| Editors | Olufemi A. Omitaomu, Ali Mostafavi, Sukanya Randhawa, Haoran Niu |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 43-49 |
| Number of pages | 7 |
| ISBN (Electronic) | 9798400711565 |
| DOIs | |
| State | Published - Oct 29 2024 |
| Event | 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI, Urban-AI 2024 - Atlanta, United States. Duration: Oct 29 2024 → … |
Publication series
| Name | Urban-AI 2024 - Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI |
|---|---|
Conference
| Conference | 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI, Urban-AI 2024 |
|---|---|
| Country/Territory | United States |
| City | Atlanta |
| Period | 10/29/24 → … |
Funding
This work was supported by the U.S. Department of Energy (U.S. DOE), Advanced Research Projects Agency–Energy (ARPA-E), under project #DE-AR0001780. We thank our collaborators from the University of Tennessee, Knoxville.
Keywords
- Bibliometrics Analysis
- Large Language Models
- Retrieval-Augmented Generation
- Transformers