Abstract
It has long been a norm that researchers extract knowledge from literature to design materials. However, the avalanche of publications makes the norm challenging to follow. Text mining (TM) is efficient in extracting information from corpora. Still, it cannot discover materials not present in the corpora, hindering its broader applications in exploring novel materials, such as high-entropy alloys (HEAs). Here we introduce a concept of “context similarity” for selecting chemical elements for HEAs, based on TM models that analyze the abstracts of 6.4 million papers. The method captures the similarity of chemical elements in the context used by scientists. It overcomes the limitations of TM and identifies the Cantor and Senkov HEAs. We demonstrate its screening capability for six- and seven-component lightweight HEAs by finding nearly 500 promising alloys out of 2.6 million candidates. The method thus brings an approach to the development of ultrahigh-entropy alloys and multicomponent materials.
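The abstract does not spell out how context similarity is computed, so the following is only a minimal sketch of one plausible reading. It assumes each chemical element is represented by a vector learned from text and that similarity between two elements is the cosine of the angle between their vectors. The toy vectors, the function names, and the use of a mean pairwise score to rank candidate element sets are all illustrative assumptions, not the authors' published method.

```python
import math
from itertools import combinations

# Toy 3-dimensional vectors standing in for text-derived element embeddings.
# The numbers are invented for illustration and are NOT taken from the paper.
TOY_EMBEDDINGS = {
    "Fe": [0.90, 0.10, 0.20],
    "Co": [0.80, 0.20, 0.30],
    "Ni": [0.85, 0.15, 0.25],
    "Na": [0.10, 0.90, 0.40],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(elements, embeddings):
    """Average cosine similarity over all element pairs in a candidate set."""
    pairs = list(combinations(elements, 2))
    total = sum(cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs)
    return total / len(pairs)


def rank_candidates(pool, size, embeddings):
    """Score every `size`-element combination from `pool`, best first."""
    scored = [
        (mean_pairwise_similarity(combo, embeddings), combo)
        for combo in combinations(pool, size)
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, combo in rank_candidates(["Fe", "Co", "Ni", "Na"], 3, TOY_EMBEDDINGS):
        print("-".join(combo), round(score, 3))
```

With these toy vectors, the set containing only the mutually similar elements ranks first, which is the kind of ordering a screening step over millions of candidate combinations would rely on.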
Original language | English |
---|---|
Article number | 54 |
Journal | Nature Communications |
Volume | 14 |
Issue number | 1 |
DOIs | |
State | Published - Dec 2023 |
Funding
We appreciate the proofreading and valuable comments of Dr. Anubhav Jain of Lawrence Berkeley National Laboratory and John Dagdelen of the University of California Berkeley. This research used resources from New York University’s Greene supercomputer and Oak Ridge National Laboratory’s Compute and Data Environment for Science (CADES) and the Oak Ridge Leadership Computing Facility (OLCF). The latter is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. One of us (P.K.L.) very much appreciates the support from (1) the National Science Foundation (DMR-1611180 and 1809640) with program directors, Drs. J. Yang, G. Shiflet, and D. Farkas and (2) the US Army Research Office (W911NF-13-1-0438 and W911NF-19-2-0049) with program managers, Drs. M.P. Bakas, S.N. Mathaudhu, and D.M. Stepp.
Funders | Funder number |
---|---|
CADES | |
Data Environment for Science | |
John Dagdelen of the University of California Berkeley | |
Oak Ridge National Laboratory | |
National Science Foundation | DMR-1611180, 1809640 |
U.S. Department of Energy | DE-AC05-00OR22725 |
Army Research Office | W911NF-19-2-0049, W911NF-13-1-0438 |
Office of Science | |
Lawrence Berkeley National Laboratory | |