Spatially Aggregated Gaussian Processes with Multivariate Areal Outputs
We propose a probabilistic model for inferring a multivariate function from multiple areal data sets with various granularities. Here, areal data are observed not at location points but over regions. Existing regression-based models require fine-grained auxiliary data sets on the same domain. In the proposed model, the functions for the respective areal data sets are assumed to follow a multivariate dependent Gaussian process (GP), modeled as a linear mixing of independent latent GPs. Sharing the latent GPs across multiple areal data sets allows us to estimate the spatial correlation for each areal data set effectively; moreover, the model can easily be extended to transfer learning across multiple domains. To handle multivariate areal data, we design an observation model with a spatial aggregation process for each areal data set, which is an integral of the mixed GP over the corresponding region. By deriving the posterior GP, we can predict the data value at any location point while simultaneously considering the spatial correlations and the dependences between areal data sets. Our experiments on real-world data sets demonstrate that our model can 1) accurately refine coarse-grained areal data, and 2) offer performance improvements by using areal data sets from multiple domains.
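To make the construction concrete, the following is a minimal one-dimensional sketch of the idea, not the authors' implementation. It assumes RBF kernels for the latent GPs, a fixed (not learned) mixing matrix `W`, Gaussian observation noise, and a region integral approximated by a uniform average over grid points; the names `rbf`, `aggregation_matrix`, and `posterior_mean` are hypothetical and introduced only for illustration.

```python
import numpy as np

def rbf(x1, x2, lengthscale):
    """Squared-exponential kernel (an assumed choice for the latent GPs)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def aggregation_matrix(edges, grid):
    """Row n averages the grid points inside region [edges[n], edges[n+1])."""
    A = np.zeros((len(edges) - 1, len(grid)))
    for n in range(len(edges) - 1):
        mask = (grid >= edges[n]) & (grid < edges[n + 1])
        A[n, mask] = 1.0 / mask.sum()
    return A

def posterior_mean(grid, W, lengthscales, edges_list, y_list, noise=1e-2):
    """Posterior mean of every output function on the grid, given areal data."""
    S, L = W.shape
    G = len(grid)
    # Mixed-GP covariance on the grid:
    # block (s, s') = sum_l W[s, l] * W[s', l] * K_l
    K_f = np.zeros((S * G, S * G))
    for l in range(L):
        K_l = rbf(grid, grid, lengthscales[l])
        K_f += np.kron(np.outer(W[:, l], W[:, l]), K_l)
    # Block-diagonal aggregation operator, one block per areal data set
    blocks = [aggregation_matrix(e, grid) for e in edges_list]
    N = sum(b.shape[0] for b in blocks)
    A = np.zeros((N, S * G))
    row = 0
    for s, b in enumerate(blocks):
        A[row:row + b.shape[0], s * G:(s + 1) * G] = b
        row += b.shape[0]
    y = np.concatenate(y_list)
    # Covariance of the aggregated observations, plus noise
    C = A @ K_f @ A.T + noise * np.eye(N)
    mean = K_f @ A.T @ np.linalg.solve(C, y)
    return mean.reshape(S, G)

# Toy example: two areal data sets on [0, 1] with different granularities.
grid = (np.arange(100) + 0.5) / 100           # cell midpoints
W = np.array([[1.0, 0.0], [0.8, 0.3]])        # assumed mixing weights
coarse = np.linspace(0, 1, 5)                 # 4 coarse regions
fine = np.linspace(0, 1, 21)                  # 20 fine regions
truth = np.sin(2 * np.pi * grid)              # synthetic ground truth
y_coarse = aggregation_matrix(coarse, grid) @ truth
y_fine = aggregation_matrix(fine, grid) @ (0.8 * truth)
mean = posterior_mean(grid, W, [0.2, 0.1], [coarse, fine], [y_coarse, y_fine])
```

Here `mean[0]` is a point-level estimate for the coarsely observed data set; it draws on the finely observed one through the shared latent GPs. In practice the mixing weights, kernel parameters, and noise level would be estimated from data rather than fixed by hand as in this sketch.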