Low-Norm Graph Embedding
Learning distributed representations for nodes in graphs has become an important problem that underpins a wide spectrum of applications. Existing approaches learn such representations by optimizing a softmax objective while constraining the dimension of the embedding vectors. We argue that the generalization performance of these methods is likely not due to the dimensionality constraint, as commonly believed, but rather to the small norm of the embedding vectors. We provide both theoretical and empirical evidence for this argument: (a) we prove that the generalization error of these methods can be bounded independently of the embedding dimension by limiting the norm of the vectors; (b) we show empirically that the generalization performance of existing embedding methods is likely due to the early stopping of stochastic gradient descent. Motivated by this analysis, we propose a new low-norm formulation of the graph embedding problem, which seeks to preserve graph structure while constraining the total squared l_2 norm of the embedding vectors. Through extensive experiments, we demonstrate that the empirical behavior of the proposed method closely matches our theoretical analysis, and that it notably outperforms state-of-the-art graph embedding methods on link prediction and node classification tasks.
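To make the low-norm formulation concrete, the following is a minimal sketch in PyTorch, assuming a skip-gram-style softmax objective over observed node pairs and a Lagrangian relaxation of the hard norm constraint (a penalty `lam` on the total squared l_2 norm of all embedding vectors). The paper's exact objective, sampling scheme, and constraint handling are not given here; all names and hyperparameters below are illustrative.

```python
# Hypothetical sketch of a norm-penalized graph embedding objective:
# a softmax (cross-entropy) loss over observed edges plus lam * sum_i ||x_i||_2^2.
# This is a Lagrangian relaxation of a hard constraint on the total squared l_2 norm,
# not the authors' exact formulation.
import torch

num_nodes, dim, lam = 100, 64, 0.1            # lam trades structure fit vs. norm
emb = torch.nn.Embedding(num_nodes, dim)      # "source" embedding vectors x_i
ctx = torch.nn.Embedding(num_nodes, dim)      # context vectors (skip-gram style)
opt = torch.optim.SGD(list(emb.parameters()) + list(ctx.parameters()), lr=0.05)

def loss(src, dst):
    """Softmax loss on edges (src -> dst) plus the squared-l_2 norm penalty."""
    logits = emb(src) @ ctx.weight.t()                    # scores against all nodes
    nll = torch.nn.functional.cross_entropy(logits, dst)  # softmax objective
    norm_penalty = emb.weight.pow(2).sum() + ctx.weight.pow(2).sum()
    return nll + lam * norm_penalty / num_nodes

# One toy SGD update on random node pairs (stand-ins for sampled edges or walks).
src = torch.randint(0, num_nodes, (32,))
dst = torch.randint(0, num_nodes, (32,))
opt.zero_grad()
loss(src, dst).backward()
opt.step()
```

Under this relaxation, increasing `lam` shrinks the total norm of the embedding matrix, playing the role that a smaller embedding dimension is usually assumed to play in controlling generalization.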