PREMA: Principled Tensor Data Recovery from Multiple Aggregated Views

by Faisal M. Almutairi, et al.

Multidimensional data have become ubiquitous and are frequently available only in aggregated form, where each reported value summarizes multiple data atoms. The aggregation can be over time or other features, such as geographical location or group affiliation. We often have access to multiple aggregated views of the same data, each aggregated in one or more dimensions, especially when data are collected or measured by different agencies. However, data mining and machine learning models require detailed data for personalized analysis and prediction. Thus, data disaggregation algorithms are becoming increasingly important in various domains. The goal of this paper is to reconstruct finer-scale data from multiple coarse views, aggregated over different (subsets of) dimensions. The proposed method, called PREMA, leverages low-rank tensor factorization tools to provide recovery guarantees under certain conditions. PREMA is flexible in the sense that it can perform disaggregation on data that have missing entries, i.e., data that are only partially observed. The proposed method considers two challenging scenarios: i) the available views of the data are aggregated in two dimensions (double aggregation), and ii) the aggregation patterns are unknown. Experiments on real data from different domains, namely sales data from retail companies, crime counts, and weather observations, are presented to showcase the effectiveness of PREMA.
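To make the setting concrete, the following is a minimal NumPy sketch of how two aggregated views of a low-rank tensor can be modeled. It is not the PREMA algorithm itself. The sizes, the rank, the factor names `A`, `B`, `C`, the aggregation matrices `U`, `V`, and the helper `sum_groups` are all assumptions chosen for illustration, and the aggregation patterns are taken as known here, whereas the paper also considers the case where they are unknown. The sketch only checks a standard property of the CP (canonical polyadic) model: aggregating one mode of a low-rank tensor gives a tensor with the same factors, except that the factor of the aggregated mode is itself aggregated. This is what couples the views to a shared set of fine-scale factors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 12 x 6 x 8 fine-scale tensor of CP rank 3.
I, J, K, R = 12, 6, 8, 3
A = rng.random((I, R))
B = rng.random((J, R))
C = rng.random((K, R))

# Low-rank (CP) tensor: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
X = np.einsum("ir,jr,kr->ijk", A, B, C)


def sum_groups(n, size):
    """Aggregation matrix that sums consecutive groups of `size` indices."""
    return np.kron(np.eye(n // size), np.ones((1, size)))


U = sum_groups(I, 4)  # 3 x 12: sums every 4 consecutive mode-1 indices
V = sum_groups(J, 2)  # 3 x 6: sums every 2 consecutive mode-2 indices

# View 1 is aggregated over mode 1, view 2 over mode 2.
Y1 = np.einsum("pi,ijk->pjk", U, X)
Y2 = np.einsum("qj,ijk->iqk", V, X)

# Each view is again a CP tensor whose aggregated-mode factor is U @ A or V @ B.
Y1_from_factors = np.einsum("pr,jr,kr->pjk", U @ A, B, C)
Y2_from_factors = np.einsum("ir,qr,kr->iqk", A, V @ B, C)

views_match = np.allclose(Y1, Y1_from_factors) and np.allclose(Y2, Y2_from_factors)
print(views_match)
```

In this toy setup, view 1 keeps the fine-scale factors `B` and `C`, and view 2 keeps `A` and `C`. This suggests why combining the views can, under suitable conditions, pin down all three fine-scale factors.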




GRATE: Granular Recovery of Aggregated Tensor Data by Example

In this paper, we address the challenge of recovering an accurate breakd...

MTC: Multiresolution Tensor Completion from Partial and Coarse Observations

Existing tensor completion formulation mostly relies on partial observat...

A General Model for Robust Tensor Factorization with Unknown Noise

Because of the limitations of matrix factorization, such as losing spati...

TenIPS: Inverse Propensity Sampling for Tensor Completion

Tensors are widely used to represent multiway arrays of data. The recove...

A Variational Information Bottleneck Approach to Multi-Omics Data Integration

Integration of data from multiple omics techniques is becoming increasin...

Exactly mergeable summaries

In the analysis of large/big data sets, aggregation (replacing values of...

Factor selection in screening experiments by aggregation over random models

Screening experiments are useful for screening out a small number of tru...
