Modular Federated Learning for Non-IID Data

Beakal Gizachew (PhD)
Samuel Hailemariam
2025-07-30
2025-07-30
2025-06
https://etd.aau.edu.et/handle/123456789/5871
en-US
Addis Ababa University
Thesis

Federated Learning (FL) promises privacy-preserving collaboration across distributed clients but is hampered by three key challenges: severe accuracy degradation under non-IID data, high communication and computational demands on edge devices, and a lack of built-in explainability for debugging, user trust, and regulatory compliance. To address these challenges, we propose two modular FL pipelines, SPATL-XL and SPATL-XLC. Both use SHAP-driven pruning; SPATL-XLC additionally applies dynamic client clustering. SPATL-XL applies SHAP-based pruning to the largest layers, removing low-impact parameters to both reduce model size and sharpen interpretability. SPATL-XLC further groups clients via lightweight clustering to reduce communication overhead and smooth convergence in low-bandwidth, high-client settings. In experiments on CIFAR-10 and Fashion-MNIST over 200 communication rounds under IID and Dirichlet non-IID splits, our pipelines lower per-round communication to 13.26 MB, speed up end-to-end training by 1.13×, raise explanation fidelity from 30–50% to 89%, match or closely approach SCAFFOLD’s 70.64% top-1 accuracy (SPATL-XL: 70.36%), and maintain stable clustering quality (Silhouette, CHI, DBI) even when only 40–70% of clients participate. These results demonstrate that combining explainability-driven pruning with adaptive clustering yields practical, communication-efficient, and regulation-ready FL pipelines that simultaneously address non-IID bias, resource constraints, and transparency requirements.
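
The abstract evaluates under "Dirichlet non-IID splits" without describing how they are built. The following is a minimal sketch of one common way to build such a split (per-class Dirichlet proportions over clients), not necessarily the procedure used in the thesis. The function name `dirichlet_partition`, the concentration parameter `alpha`, and the client count are illustrative assumptions; the record does not state them.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients using per-class Dirichlet proportions.

    Hypothetical helper for illustration; smaller alpha gives more skewed
    (more strongly non-IID) label distributions per client.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)

        # A Dirichlet(alpha, ..., alpha) sample is a normalised vector of
        # independent Gamma(alpha, 1) draws.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws) or 1.0  # guard against an all-zero draw

        # Turn the proportions into cumulative cut points for this class.
        cuts, acc = [], 0.0
        for d in draws[:-1]:
            acc += d / total
            cuts.append(min(int(acc * len(idxs)), len(idxs)))

        start = 0
        for client, end in zip(clients, cuts + [len(idxs)]):
            client.extend(idxs[start:end])
            start = end
    return clients


# Usage: 1,000 samples over 10 classes split across 5 clients (assumed values).
labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(labels, num_clients=5, alpha=0.5)
sizes = [len(p) for p in parts]
```

Each class is divided independently, so every sample lands on exactly one client while the per-client label mix varies. A large `alpha` approaches an IID split; a small `alpha` concentrates each class on a few clients.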