TY - GEN
T1 - Improving Federated Learning Through Low-Entropy Client Sampling Based on Learned High-Level Features
AU - Abebe, Waqwoya
AU - Munoz, Pablo
AU - Jannesari, Ali
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Data heterogeneity impacts the performance of Federated Learning (FL) by introducing training noise. Although representative client sampling can help mitigate the issue, it remains challenging to implement without compromising data privacy. This work introduces a new method to address the problem by proposing an affordable blind (privacy-preserving) clustering mechanism for conducting stratified client sampling. Inspired by the 'dialect quiz', we propose a 'response test' to cluster clients whose models have learned similar high-level features. This approach facilitates representative client sampling without the need for direct access to client data. We demonstrate empirically that our method yields client samples with low relative entropy with respect to the global data distribution, indicating increased representativeness. Convergence experiments reveal that applying our method significantly improves the convergence and accuracy of the global model compared to strong baselines like SCAFFOLD and FL-CIR. Additionally, the reduced number of training rounds required to achieve target accuracy leads to decreased communication overhead and computational expense, making our approach promising for practical FL implementations.
AB - Data heterogeneity impacts the performance of Federated Learning (FL) by introducing training noise. Although representative client sampling can help mitigate the issue, it remains challenging to implement without compromising data privacy. This work introduces a new method to address the problem by proposing an affordable blind (privacy-preserving) clustering mechanism for conducting stratified client sampling. Inspired by the 'dialect quiz', we propose a 'response test' to cluster clients whose models have learned similar high-level features. This approach facilitates representative client sampling without the need for direct access to client data. We demonstrate empirically that our method yields client samples with low relative entropy with respect to the global data distribution, indicating increased representativeness. Convergence experiments reveal that applying our method significantly improves the convergence and accuracy of the global model compared to strong baselines like SCAFFOLD and FL-CIR. Additionally, the reduced number of training rounds required to achieve target accuracy leads to decreased communication overhead and computational expense, making our approach promising for practical FL implementations.
KW - Client Sampling
KW - Communication Efficiency
KW - Federated Learning
KW - Performance Gain
UR - http://www.scopus.com/inward/record.url?scp=85203270368&partnerID=8YFLogxK
U2 - 10.1109/CLOUD62652.2024.00013
DO - 10.1109/CLOUD62652.2024.00013
M3 - Conference contribution
AN - SCOPUS:85203270368
T3 - IEEE International Conference on Cloud Computing, CLOUD
SP - 20
EP - 29
BT - Proceedings - 2024 IEEE 17th International Conference on Cloud Computing, CLOUD 2024
A2 - Chang, Rong N.
A2 - Chang, Carl K.
A2 - Yang, Jingwei
A2 - Atukorala, Nimanthi
A2 - Jin, Zhi
A2 - Sheng, Michael
A2 - Fan, Jing
A2 - Fletcher, Kenneth
A2 - He, Qiang
A2 - Kosar, Tevfik
A2 - Sarkar, Santonu
A2 - Venkateswaran, Sreekrishnan
A2 - Wang, Shangguang
A2 - Liu, Xuanzhe
A2 - Seelam, Seetharami
A2 - Narayanaswami, Chandra
A2 - Zong, Ziliang
PB - IEEE Computer Society
T2 - 17th IEEE International Conference on Cloud Computing, CLOUD 2024
Y2 - 7 July 2024 through 13 July 2024
ER -