Community detection using low-dimensional network embedding algorithms

by Aman Barot, et al.

With the increasing relevance of large networks in important areas such as the study of contact networks for the spread of disease, or social networks for their impact on geopolitics, it has become necessary to study machine learning tools that scale to very large networks, often containing millions of nodes. One major class of such scalable algorithms is known as network representation learning or network embedding. These algorithms try to learn representations of network functionals (e.g. nodes) by first running multiple random walks and then using the number of co-occurrences of each pair of nodes in observed random walk segments to obtain a low-dimensional representation of nodes in some Euclidean space. The aim of this paper is to rigorously understand the performance of two major algorithms, DeepWalk and node2vec, in recovering communities for canonical network models with ground truth communities. Depending on the sparsity of the graph, we find the length of the random walk segments required such that the corresponding observed co-occurrence window is able to perform almost exact recovery of the underlying community assignments. We prove that, given some fixed co-occurrence window, node2vec using random walks with a low non-backtracking probability can succeed for much sparser networks compared to DeepWalk using simple random walks. Moreover, if the sparsity parameter is low, we provide evidence that these algorithms might not succeed in almost exact recovery. The analysis requires developing general tools for path counting on random networks having an underlying low-rank structure, which are of independent interest.
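The pipeline described above (random walks, then pairwise co-occurrence counts within a window, then a low-dimensional embedding) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-block stochastic block model, the parameter values, and the function names (`sample_sbm`, `cooccurrence_counts`, `spectral_embedding`, `recovery_accuracy`) are hypothetical. It uses simple random walks as in DeepWalk, not node2vec's biased walks, and it swaps the skip-gram training step for an eigendecomposition of the degree-normalized co-occurrence matrix, which is a simpler stand-in rather than either algorithm as published.

```python
import numpy as np

def sample_sbm(n, p_in, p_out, rng):
    """Two equal-sized communities; edge probability p_in inside, p_out across."""
    labels = np.repeat([0, 1], n // 2)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample each pair once
    adj = (upper | upper.T).astype(int)               # symmetrize, no self-loops
    return adj, labels

def cooccurrence_counts(adj, num_walks, walk_len, window, rng):
    """Count pairs of nodes that lie within `window` steps on simple random walks."""
    n = adj.shape[0]
    neighbors = [np.flatnonzero(adj[i]) for i in range(n)]
    counts = np.zeros((n, n))
    for start in range(n):
        if len(neighbors[start]) == 0:
            continue  # isolated nodes cannot start a walk
        for _ in range(num_walks):
            walk = [start]
            for _ in range(walk_len - 1):
                walk.append(rng.choice(neighbors[walk[-1]]))
            for i, u in enumerate(walk):
                for v in walk[i + 1 : i + 1 + window]:
                    counts[u, v] += 1
                    counts[v, u] += 1
    return counts

def spectral_embedding(counts, dim):
    """Stand-in for skip-gram: top eigenvectors of the normalized count matrix."""
    deg = np.maximum(counts.sum(axis=1), 1.0)   # guard against empty rows
    norm = counts / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(norm)              # eigenvalues in ascending order
    return vecs[:, ::-1][:, :dim]               # columns for the largest eigenvalues

def recovery_accuracy(embedding, labels):
    """Split on the sign of the second coordinate; score up to label swap."""
    pred = (embedding[:, 1] > 0).astype(int)
    agree = np.mean(pred == labels)
    return max(agree, 1.0 - agree)

rng = np.random.default_rng(0)
adj, labels = sample_sbm(n=100, p_in=0.5, p_out=0.05, rng=rng)
counts = cooccurrence_counts(adj, num_walks=10, walk_len=20, window=3, rng=rng)
embedding = spectral_embedding(counts, dim=2)
print("fraction of nodes recovered:", recovery_accuracy(embedding, labels))
```

The graph here is deliberately dense and well separated, so the sketch says nothing about the sparse regimes the paper analyzes. Lowering `p_in` and `p_out` or shrinking `window` is one way to explore how recovery degrades in this toy setting.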



Consistency of random-walk based network embedding algorithms

Random-walk based network embedding algorithms like node2vec and DeepWal...

Synwalk – Community Detection via Random Walk Modelling

Complex systems, abstractly represented as networks, are ubiquitous in e...

dynnode2vec: Scalable Dynamic Network Embedding

Network representation learning in low dimensional vector space has attr...

Online Factorization and Partition of Complex Networks From Random Walks

Finding the reduced-dimensional structure is critical to understanding c...

Community detection in multiplex networks using locally adaptive random walks

Multiplex networks, a special type of multilayer networks, are increasin...

An Extension of InfoMap to Absorbing Random Walks

InfoMap is a popular approach for detecting densely connected "communiti...

Walk-and-Relate: A Random-Walk-based Algorithm for Representation Learning on Sparse Knowledge Graphs

Knowledge graph (KG) embedding techniques use structured relationships b...
