Privacy-Preserving Federated Learning for Science: Challenges and Research Directions

Kibaek Kim, Krishnan Raghavan, Olivera Kotevska, Matthieu Dorier, Ravi Madduri, Minseok Ryu, Todd Munson, Rob Ross, Thomas Flynn, Ai Kagawa, Byung Jun Yoon, Christian Engelmann, Farzad Yousefian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper discusses key challenges and future research directions for privacy-preserving federated learning (PPFL), with a focus on its application to large-scale scientific artificial intelligence models, in particular foundation models (FMs). PPFL enables collaborative model training across distributed datasets while preserving privacy, which makes it an important collaborative approach for science. We discuss the need for efficient and scalable algorithms to address the increasing complexity of FMs, particularly when dealing with heterogeneous clients. In addition, we underscore the need to develop advanced privacy-preserving techniques, such as differential privacy, to balance privacy and utility in large FMs, while emphasizing fairness and incentive mechanisms to ensure equitable participation among heterogeneous clients. Finally, we highlight the need for a robust software stack supporting scalable and secure PPFL deployments across multiple high-performance computing facilities. We envision that PPFL will play a crucial role in advancing scientific discovery and enabling large-scale, privacy-aware collaborations across science domains.

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
Editors: Wei Ding, Chang-Tien Lu, Fusheng Wang, Liping Di, Kesheng Wu, Jun Huan, Raghu Nambiar, Jundong Li, Filip Ilievski, Ricardo Baeza-Yates, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7849-7853
Number of pages: 5
ISBN (Electronic): 9798350362480
State: Published - 2024
Event: 2024 IEEE International Conference on Big Data, BigData 2024 - Washington, United States
Duration: Dec 15 2024 to Dec 18 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024

Conference

Conference: 2024 IEEE International Conference on Big Data, BigData 2024
Country/Territory: United States
City: Washington
Period: 12/15/24 to 12/18/24

Funding

This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357.

Keywords

  • distributed computing
  • federated learning
  • foundation models
  • privacy preservation
