Evaluating algorithmic bias on biomarker classification of breast cancer pathology reports

Jordan Tschida, Mayanka Chandrashekar, Alina Peluso, Zachary Fox, Patrycja Krawczuk, Dakota Murdock, Xiao Cheng Wu, John Gounley, Heidi A. Hanson

Research output: Contribution to journal › Article › peer-review

Abstract

Objectives: This work evaluated algorithmic bias in biomarker classification using electronic pathology reports from female breast cancer cases. Bias was assessed across 5 subgroups: cancer registry, race, Hispanic ethnicity, age at diagnosis, and socioeconomic status.

Materials and Methods: We utilized 594 875 electronic pathology reports from 178 121 tumors diagnosed in Kentucky, Louisiana, New Jersey, New Mexico, Seattle, and Utah to train 2 deep-learning algorithms to classify breast cancer patients by their biomarker test results. We used balanced error rate (BER), demographic parity (DP), equalized odds (EOD), and equal opportunity (EOP) to assess bias.

Results: We found differences in predictive accuracy between registries, with the highest accuracy in the registry that contributed the most data (Seattle registry; BER ratios for the other registries >1.25). BER showed no significant algorithmic bias in extracting biomarkers (estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2) for race, Hispanic ethnicity, age at diagnosis, or socioeconomic subgroups (BER ratio <1.25). DP, EOD, and EOP likewise showed no significant bias.

Discussion: We observed significant differences in BER by registry, but no significant bias using the DP, EOD, and EOP metrics for sociodemographic or racial categories. This highlights the importance of employing a diverse set of metrics for a comprehensive evaluation of model fairness.

Conclusion: A thorough evaluation of algorithmic biases that may affect equality in clinical care is a critical step before deploying algorithms in the real world. We found little evidence of algorithmic bias in our biomarker classification tool. Artificial intelligence tools that expedite information extraction from clinical records could accelerate clinical trial matching and improve care.
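For readers unfamiliar with the fairness metrics named above, the following is a minimal illustrative sketch in Python (not the authors' code; all function and variable names are hypothetical) of how per-subgroup balanced error rate, the BER ratio against a reference subgroup, and the rates underlying demographic parity, equalized odds, and equal opportunity can be computed for binary predictions.

```python
# Illustrative sketch of the fairness metrics named in the abstract
# (BER, BER ratio, demographic parity, equalized odds, equal opportunity).
# Not the authors' implementation; assumes binary labels and predictions.
import numpy as np

def balanced_error_rate(y_true, y_pred):
    # BER = (false negative rate + false positive rate) / 2
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fnr = np.mean(y_pred[y_true == 1] == 0) if (y_true == 1).any() else 0.0
    fpr = np.mean(y_pred[y_true == 0] == 1) if (y_true == 0).any() else 0.0
    return 0.5 * (fnr + fpr)

def subgroup_metrics(y_true, y_pred, groups):
    # Per-subgroup BER, positive-prediction rate (demographic parity),
    # true-positive rate (equal opportunity), and false-positive rate
    # (together with TPR, equalized odds).
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    out = {}
    for g in np.unique(groups):
        yt, yp = y_true[groups == g], y_pred[groups == g]
        out[g] = {
            "BER": balanced_error_rate(yt, yp),
            "pos_rate": float(np.mean(yp == 1)),
            "TPR": float(np.mean(yp[yt == 1] == 1)) if (yt == 1).any() else float("nan"),
            "FPR": float(np.mean(yp[yt == 0] == 1)) if (yt == 0).any() else float("nan"),
        }
    return out

def ber_ratios(metrics, reference_group):
    # BER of each subgroup divided by the BER of a reference subgroup
    # (e.g. the best-performing registry); the paper flags ratios above 1.25.
    ref = metrics[reference_group]["BER"]
    return {g: (m["BER"] / ref if ref > 0 else float("nan"))
            for g, m in metrics.items()}

# Toy usage:
# y_true = [1, 0, 1, 1, 0, 0]; y_pred = [1, 0, 0, 1, 1, 0]
# groups = ["A", "A", "A", "B", "B", "B"]
# m = subgroup_metrics(y_true, y_pred, groups)
# print(ber_ratios(m, reference_group="A"))
```

In this framing, demographic parity compares positive-prediction rates across subgroups, equal opportunity compares true-positive rates, and equalized odds compares both true- and false-positive rates; BER ratios near 1 indicate comparable error balance between subgroups.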

Original language: English
Article number: ooaf033
Journal: JAMIA Open
Volume: 8
Issue number: 3
DOIs
State: Published - Jun 1 2025

Funding

This work has been supported in part by the US Department of Energy (DOE) and the National Cancer Institute of the National Institutes of Health. This work was performed under the auspices of the DOE by Argonne National Laboratory under Contract DE-AC02-06CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract DE-AC52-06NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). The authors would like to acknowledge the contribution to this study from other staff in the participating central cancer registries. These registries are supported by the NCI's SEER program, the Centers for Disease Control and Prevention's National Program of Cancer Registries (NPCR), and/or state agencies, universities, and cancer centers. The participating central cancer registries include the following:

Keywords

  • algorithmic bias
  • artificial intelligence
  • biomarkers
  • breast cancer
  • population-level
