A Spectral Method for Identifiable Grade of Membership Analysis with Binary Responses

by Ling Chen et al.

Grade of Membership (GoM) models are popular individual-level mixture models for multivariate categorical data. A GoM model allows each subject to have mixed memberships in multiple extreme latent profiles. GoM models therefore have a richer modeling capacity than latent class models, which restrict each subject to a single profile. This flexibility comes at the cost of more challenging identifiability and estimation problems. In this work, we propose a spectral approach to GoM analysis with multivariate binary responses, based on the singular value decomposition (SVD). Our approach hinges on the observation that the expectation of the data matrix has a low-rank decomposition under a GoM model. For identifiability, we develop sufficient and almost necessary conditions for a notion of expectation identifiability. For estimation, we extract only a few leading singular vectors of the observed data matrix and exploit the simplex geometry of these vectors to estimate the mixed membership scores and other parameters. Our spectral method has a substantial computational advantage over Bayesian or likelihood-based methods and scales to large, high-dimensional data. Extensive simulation studies demonstrate the superior efficiency and accuracy of our method. We also illustrate the method by applying it to a personality test dataset.
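To make the low-rank and simplex ideas concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the expected response matrix factors as Π Θᵀ, where the rows of Π are membership scores on the probability simplex and Θ holds item probabilities per profile. It also assumes at least one "pure" subject per profile. The abstract does not say how the simplex vertices are located, so the sketch uses a plain successive projection step as a stand-in. All function names (`simulate_gom`, `successive_projection`, `spectral_gom`) are hypothetical.

```python
import numpy as np


def simulate_gom(n_subjects, n_items, n_profiles, rng):
    """Draw binary responses from an assumed GoM model E[R] = Pi @ Theta.T."""
    # Membership scores: each row lies on the probability simplex.
    pi = rng.dirichlet(np.full(n_profiles, 0.5), size=n_subjects)
    # Assumption: the first K subjects are pure, one per extreme profile.
    pi[:n_profiles] = np.eye(n_profiles)
    # Item parameters: probability of a positive response under each profile.
    theta = rng.uniform(0.05, 0.95, size=(n_items, n_profiles))
    responses = rng.binomial(1, pi @ theta.T)
    return pi, theta, responses


def successive_projection(points, k):
    """Pick k rows of `points` that approximate the simplex vertices."""
    residual = np.asarray(points, dtype=float).copy()
    vertex_rows = []
    for _ in range(k):
        # The row with the largest norm is taken as the next vertex.
        idx = int(np.argmax(np.sum(residual**2, axis=1)))
        vertex_rows.append(idx)
        direction = residual[idx] / np.linalg.norm(residual[idx])
        # Project every row onto the orthogonal complement of that vertex.
        residual -= np.outer(residual @ direction, direction)
    return vertex_rows


def spectral_gom(matrix, k):
    """Estimate memberships and item parameters from the top-k SVD."""
    u, s, vt = np.linalg.svd(np.asarray(matrix, dtype=float), full_matrices=False)
    u_k, s_k, v_k = u[:, :k], s[:k], vt[:k].T
    vertex_rows = successive_projection(u_k, k)
    u_vertices = u_k[vertex_rows]  # k x k matrix of estimated vertices
    # Each row of U is modeled as a convex combination of the vertex rows.
    pi_hat = u_k @ np.linalg.inv(u_vertices)
    pi_hat = np.clip(pi_hat, 0.0, None)
    pi_hat /= np.maximum(pi_hat.sum(axis=1, keepdims=True), 1e-12)
    # From U S V^T = Pi (U_vertices S V^T), read off Theta = V S U_vertices^T.
    theta_hat = np.clip(v_k @ np.diag(s_k) @ u_vertices.T, 0.0, 1.0)
    return pi_hat, theta_hat, vertex_rows
```

Calling `spectral_gom(responses, k)` on simulated binary data returns estimates that match the true parameters only up to a relabeling of the profiles, given by `vertex_rows` in this sketch. On the noiseless expectation matrix the recovery is exact up to floating-point error, which is a useful sanity check before adding Bernoulli noise.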



Dimension-Grouped Mixed Membership Models for Multivariate Categorical Data

Mixed Membership Models (MMMs) are a popular family of latent structure ...

Flexible Regularized Estimation in High-Dimensional Mixed Membership Models

Mixed membership models are an extension of finite mixture models, where...

Multivariate mixed membership modeling: Inferring domain-specific risk profiles

Characterizing shared membership of individuals in two or more categorie...

Functional Partial Membership Models

Partial membership models, or mixed membership models, are a flexible un...

A general error analysis for randomized low-rank approximation methods

We propose a general error analysis related to the low-rank approximatio...

Learning a Latent Simplex in Input-Sparsity Time

We consider the problem of learning a latent k-vertex simplex K⊂ℝ^d, giv...

Incompletely observed nonparametric factorial designs with repeated measurements: A wild bootstrap approach

In many life science experiments or medical studies, subjects are repeat...
