Graph Learning with Loss-Guided Training
Classically, ML models trained with stochastic gradient descent (SGD) are designed to minimize the average loss per example, using a distribution of training examples that remains static over the course of training. Research in recent years has demonstrated, both empirically and theoretically, that significant acceleration is possible with methods that dynamically adjust the training distribution during training so that it focuses more on examples with higher loss. We explore loss-guided training in a new domain: the node embedding methods pioneered by DeepWalk. These methods work with an implicit and large set of positive training examples that are generated by random walks on the input graph and are therefore not amenable to typical example-selection methods. We propose computationally efficient methods that enable loss-guided training in this framework. Our empirical evaluation on a rich collection of datasets shows significant acceleration over the baseline static methods, both in terms of total training performed and overall computation.
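As a rough illustration of the general idea of loss-guided training (not the paper's algorithm for node embeddings), a minimal sketch of loss-proportional example sampling for SGD on a toy logistic-regression problem might look like the following; the data, the loss-estimate refresh rule, and the importance-weighting scheme are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: binary classification with a linear model.
n, d = 1000, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)

w = np.zeros(d)
lr = 0.1
batch = 32

# Per-example loss estimates; start uniform so every example can be drawn.
loss_est = np.ones(n)

def example_losses(w, Xb, yb):
    # Logistic loss per example, with labels mapped to {-1, +1}.
    z = Xb @ w
    return np.log1p(np.exp(-z * (2 * yb - 1)))

for step in range(2000):
    # Loss-guided sampling: draw examples with probability proportional
    # to their current loss estimate (static SGD would sample uniformly).
    p = loss_est / loss_est.sum()
    idx = rng.choice(n, size=batch, p=p)
    Xb, yb = X[idx], y[idx]

    # Gradient of the logistic loss; importance weights 1/(n*p) keep the
    # gradient estimate unbiased under the skewed sampling distribution.
    z = Xb @ w
    sigma = 1.0 / (1.0 + np.exp(-z))
    weights = 1.0 / (n * p[idx])
    grad = Xb.T @ ((sigma - yb) * weights)
    w -= lr * grad

    # Refresh loss estimates for the examples just visited.
    loss_est[idx] = example_losses(w, Xb, yb) + 1e-6
```

The sketch relies on each training example being explicitly enumerable so its loss can be tracked; as the abstract notes, random-walk-based node embedding methods instead generate positive examples implicitly and in large numbers, which is what makes standard example selection of this kind inapplicable without further work.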