Variance and covariance of distributions on graphs
We develop a theory to measure the variance and covariance of probability distributions defined on the nodes of a graph, which takes into account the distance between nodes. Our approach generalizes the usual (co)variance to the setting of weighted graphs and retains many of its intuitive and desired properties. Interestingly, we find that a number of famous concepts in graph theory and network science can be reinterpreted in this setting as variances and covariances of particular distributions: we show this correspondence for Kemeny's constant, the Kirchhoff index, network modularity and Markov stability. As a particular application, we define the maximum-variance problem on graphs with respect to the effective resistance distance, and characterize the solutions to this problem both numerically and theoretically. We show how the maximum-variance distribution can be interpreted as a core-periphery measure, illustrated by the fact that these distributions are supported on the leaf nodes of tree graphs, low-degree nodes in a configuration-like graph and boundary nodes in random geometric graphs. Our theoretical results are supported by a number of experiments on a network of mathematical concepts, where we use the variance and covariance as analytical tools to study the (co-)occurrence of concepts in scientific papers with respect to the (network) relations between these concepts.
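The abstract does not state the defining formula, so the following is only a minimal sketch under two labeled assumptions. First, the variance of a distribution p on the nodes is taken to be half the expected squared distance between two independent samples from p, which is the identity that the usual variance satisfies on the real line. Second, the effective resistance between two nodes is used as the squared distance. The function names `effective_resistance` and `graph_variance` are hypothetical and chosen here for illustration.

```python
import numpy as np


def effective_resistance(adjacency):
    """Pairwise effective resistances of a connected weighted graph.

    Uses the standard identity R_ij = L+_ii + L+_jj - 2 L+_ij, where L+
    is the Moore-Penrose pseudoinverse of the graph Laplacian.
    """
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A
    L_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(L_pinv)
    return diag[:, None] + diag[None, :] - 2.0 * L_pinv


def graph_variance(p, sq_dist):
    """Assumed definition: 0.5 * sum_ij p_i p_j d^2(i, j)."""
    p = np.asarray(p, dtype=float)
    return 0.5 * p @ sq_dist @ p


# Path graph on three nodes, 0 - 1 - 2, with unit edge weights.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
R = effective_resistance(path)

var_leaves = graph_variance([0.5, 0.0, 0.5], R)    # mass on the two leaves
var_uniform = graph_variance([1/3, 1/3, 1/3], R)   # mass spread over all nodes
var_point = graph_variance([1.0, 0.0, 0.0], R)     # all mass on one node
print(var_leaves, var_uniform, var_point)
```

Under these assumptions, the distribution concentrated on the two leaves of the path has a larger variance than the uniform distribution, and a point mass has variance zero. This is in line with the abstract's statement that maximum-variance distributions on trees are supported on the leaf nodes, although the sketch only compares three candidate distributions and does not solve the maximization problem.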