Abstract
This review draws attention to two issues of concern in making machine learning work in the chemical and materials domain: statistical loss function metrics for the validation and benchmarking of data-derived models, and uncertainty quantification for the predictions these models make. Both topics are often overlooked or underappreciated because chemists typically have only limited training in statistics. Aside from helping to assess the quality, reliability, and applicability of a given model, these metrics are key to comparing the performance of different models and thus to developing guidelines and best practices for the successful application of machine learning in chemistry.
Original language | English |
---|---|
Pages (from-to) | 146-156 |
Number of pages | 11 |
Journal | Trends in Chemistry |
Volume | 3 |
Issue number | 2 |
State | Published - Feb 2021 |
Externally published | Yes |
Funding
This work was supported by the NSF CAREER program under grant No. OAC-1751161 and the NSF Big Data Spokes program under grant No. IIS-1761990.
Keywords
- benchmarking
- machine learning
- model validation
- statistical loss function
- uncertainty quantification