Abstract
ChemML is an open machine learning (ML) and informatics program suite that is designed to support and advance the data-driven research paradigm that is currently emerging in the chemical and materials domain. ChemML allows its users to perform various data science tasks and execute ML workflows that are adapted specifically for the chemical and materials context. Key features are automation, general-purpose utility, versatility, and user-friendliness in order to make the application of modern data science a viable and widely accessible proposition in the broader chemistry and materials community. ChemML is also designed to facilitate methodological innovation, and it is one of the cornerstones of the software ecosystem for data-driven in silico research.
This article is categorized under:
- Software > Simulation Methods
- Computer and Information Science > Chemoinformatics
- Structure and Mechanism > Computational Materials Science
- Software > Molecular Modeling
Original language | English |
---|---|
Article number | e1458 |
Journal | Wiley Interdisciplinary Reviews: Computational Molecular Science |
Volume | 10 |
Issue number | 4 |
DOIs | |
State | Published - Jul 1 2020 |
Externally published | Yes |
Funding
This work is supported by the National Science Foundation (NSF) CAREER program (grant No. OAC‐1751161), and the New York State Center of Excellence in Materials Informatics (grants No. CMI‐1140384 and CMI‐1148092). Early work on ChemML was supported by start‐up funds provided through the University at Buffalo (UB). The deep eutectic solvent application study was funded by the Army Armament Research, Development and Engineering Center (ARDEC) SBIR program (grant No. W15QKN‐17‐C‐0078), and solubility parameter work by Toyota Motor Engineering and Manufacturing North America. ChemML is interfaced with the Open Chemistry platform and the MaDE@UB toolkit, and these efforts are supported by the Department of Energy SBIR program (grant No. DE‐SC0017193) and the NSF DIBBs program (grant No. OAC‐1640867), respectively. The DIBBs grant also funded the implementation of several methods of particular interest for MaDE@UB into ChemML, such as the Magpie library, the meta data parser, and standard DNNs using Keras. Computing time on the high‐performance computing clusters “Rush,” “Alpha,” “Beta,” and “Gamma” was provided by the UB Center for Computational Research (CCR). The work presented in this paper is a central part of M.H.'s PhD thesis. M.H. gratefully acknowledges support by Phase‐I and Phase‐II Software Fellowships (grant No. ACI‐1547580‐479590) of the NSF Molecular Sciences Software Institute (grant No. ACI‐1547580) at Virginia Tech. We thank the other members, past and present, of the Hachmann group as well as Profs. Venugopal Govindaraju and Krishna Rajan (both UB) for valuable discussions and insights that have helped guide the development of ChemML.
Funders | Funder number |
---|---|
Department of Energy SBIR | DE‐SC0017193 |
NSF Molecular Sciences Software Institute | ACI‐1547580‐479590 |
New York State Center of Excellence in Materials Informatics | CMI‐1140384, CMI‐1148092 |
UB Center for Computational Research | |
National Science Foundation | OAC‐1640867, OAC‐1751161, ACI‐1547580 |
Office of Science | DE‐SC0017193 |
University at Buffalo | |
Armament Research, Development and Engineering Center | W15QKN‐17‐C‐0078 |
Toyota Motor Engineering and Manufacturing North America | |
Keywords
- data science
- data-driven research
- informatics
- machine learning
- program package