RETEXO: Scalable Neural Network Training over Distributed Graphs
Graph neural networks offer a promising approach to supervised learning over graph data. Graph data, especially when it is privacy-sensitive or too large to train on centrally, is often stored partitioned across disparate processing units (clients) which want to minimize the communication costs during collaborative training. The fully-distributed setup takes such partitioning to its extreme, wherein features of only a single node and its adjacent edges are kept locally with one client processor. Existing GNNs are not architected for training in such setups and incur prohibitive costs therein. We propose RETEXO, a novel transformation of existing GNNs that improves the communication efficiency during training in the fully-distributed setup. We experimentally confirm that RETEXO offers up to 6 orders of magnitude better communication efficiency even when training shallow GNNs, with a minimal trade-off in accuracy for supervised node classification tasks.
READ FULL TEXT