Abstract
Knowledge-driven discovery of novel materials requires causal models of property emergence. In the classical physical paradigm, causal relationships are deduced from physical principles or established by experiment. The rapid accumulation of observational data, however, makes it necessary to learn causal relationships between dissimilar aspects of material structure and functionality directly from observations, which in turn requires integrating experimental data with prior domain knowledge. Here, we demonstrate this approach by combining high-resolution scanning transmission electron microscopy data with insights derived from large language models (LLMs). By applying ChatGPT to domain-specific literature, such as arXiv papers on ferroelectrics, and combining the extracted information with data-driven causal discovery, we construct adjacency matrices for directed acyclic graphs that map the causal relationships between structural, chemical, and polarization degrees of freedom in Sm-doped BiFeO3. This approach enables us to hypothesize how synthesis conditions influence material properties, and it guides experimental validation. The ultimate objective of this work is a unified framework that integrates LLM-driven literature analysis with data-driven discovery, facilitating the precise engineering of ferroelectric materials by establishing clear connections between synthesis conditions and the resulting material properties.
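As a rough illustration of what combining a literature-derived prior with a data-driven adjacency matrix can look like, the sketch below intersects two toy adjacency matrices and checks that the result is a directed acyclic graph. Everything in it is an assumption made for illustration: the variable names, the matrix values, the convention that `adj[i, j] = 1` means "i causes j", and the intersection rule itself. None of it is taken from the paper, and the authors' actual procedure may differ.

```python
import numpy as np

# Hypothetical descriptors; the abstract does not list the actual variables.
VARIABLES = ["composition", "strain", "polarization"]


def is_dag(adj):
    """Return True if the directed graph has no cycles (Kahn's algorithm).

    Convention assumed here: adj[i, j] != 0 means an edge i -> j.
    """
    adj = np.asarray(adj, dtype=bool)
    n = adj.shape[0]
    in_degree = adj.sum(axis=0).astype(int)  # column sums = incoming edges
    queue = [i for i in range(n) if in_degree[i] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in np.flatnonzero(adj[node]):
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(int(child))
    # Nodes on a cycle never reach in-degree zero, so they are never visited.
    return visited == n


def combine(data_adj, prior_adj):
    """Keep only data-driven edges that the literature prior also allows.

    This intersection rule is an illustrative assumption, not the paper's method.
    """
    combined = np.logical_and(data_adj, prior_adj).astype(int)
    if not is_dag(combined):
        raise ValueError("combined graph contains a cycle")
    return combined


# Toy matrices, not results from the paper.
data_adj = np.array([[0, 1, 1],
                     [0, 0, 1],
                     [0, 0, 0]])
prior_adj = np.array([[0, 1, 0],
                      [0, 0, 1],
                      [0, 0, 0]])

combined_adj = combine(data_adj, prior_adj)
for i, j in zip(*np.nonzero(combined_adj)):
    print(f"{VARIABLES[i]} -> {VARIABLES[j]}")
```

Intersection is the most conservative way to merge the two sources. A softer alternative would use the prior as a set of forbidden or required edges inside the causal discovery step, which is equally hypothetical here.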
| Original language | English |
|---|---|
| Article number | 121904 |
| Journal | Applied Physics Letters |
| Volume | 127 |
| Issue number | 12 |
| State | Published - Sep 22 2025 |
Funding
This work (workflow development and concept; K.B. and S.V.K.) was supported as part of the Center for 3D Ferroelectric Microelectronics Manufacturing (3DFeM2), an Energy Frontier Research Center funded by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences under Award Number DE-SC0021118. STEM imaging (C.N.) was performed at the Oak Ridge National Laboratory's Center for Nanophase Materials Sciences (CNMS). The work at the University of Maryland was supported in part by the National Institute of Standards and Technology Cooperative Agreement No. 70NANB17H301 and the Center for Spintronic Materials in Advanced infoRmation Technologies (SMART), one of the centers in nCORE, a Semiconductor Research Corporation (SRC) program sponsored by NSF and NIST.