D-Cliques: Compensating NonIIDness in Decentralized Federated Learning with Topology
The convergence speed of machine learning models trained with Federated Learning is significantly affected by non-independent and identically distributed (non-IID) data partitions, even more so in a fully decentralized setting without a central server. In this paper, we show that the impact of local class bias, an important type of data non-IIDness, can be significantly reduced by carefully designing the underlying communication topology. We present D-Cliques, a novel topology that reduces gradient bias by grouping nodes in interconnected cliques such that the local joint distribution in a clique is representative of the global class distribution. We also show how to adapt the updates of decentralized SGD to obtain unbiased gradients and implement an effective momentum with D-Cliques. Our empirical evaluation on MNIST and CIFAR10 demonstrates that our approach provides similar convergence speed as a fully-connected topology with a significant reduction in the number of edges and messages. In a 1000-node topology, D-Cliques requires 98 edges and 96 small-world topology across cliques.
READ FULL TEXT 
  
  
     share
 share