Multivariate mixed membership modeling: Inferring domain-specific risk profiles

01/16/2019
by   Massimiliano Russo, et al.

Characterizing shared membership of individuals in two or more categories of a classification scheme poses severe interpretability problems when the number of categories is large (e.g., greater than six). Mixed membership models quantify this phenomenon, but they usually focus on the structure of the extreme profiles consistent with the given data and avoid the characterization problem. Estimation methods yield models with good numerical fits, but usually with a number of profiles in the 20s or higher. Resolving the interpretability problem is facilitated by first partitioning the set of variables into distinct subject-matter-based domains. We then introduce a new class of multivariate mixed membership models that take explicit account of the blocks of variables corresponding to the distinct domains and of a cross-domain correlation structure, which yields new information about shared membership of individuals in a complex classification scheme. We specify a multivariate logistic normal distribution for the membership vectors, which allows easy introduction of auxiliary information through a latent multivariate logistic regression. A Bayesian approach to inference, relying on Pólya-Gamma data augmentation, facilitates efficient posterior computation via Markov chain Monte Carlo. We apply this methodology to a spatially explicit study of malaria risk over time on the Brazilian Amazon frontier.
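
To make the idea of domain-specific membership vectors with cross-domain correlation concrete, the following is a minimal sketch, not the authors' implementation. It assumes each domain's membership vector is obtained by applying a softmax (with the last profile as a zero reference) to one block of a jointly Gaussian latent vector; the abstract does not give the exact parameterization, so the function name `sample_memberships`, the reference-category convention, and the covariance values are all illustrative assumptions.

```python
import numpy as np


def sample_memberships(n, profiles_per_domain, mean, cov, seed=0):
    """Draw per-domain membership vectors from a logistic normal (hypothetical helper).

    Each domain d with H_d profiles uses H_d - 1 latent Gaussian coordinates;
    the last profile is an assumed reference category with logit 0.
    """
    rng = np.random.default_rng(seed)
    dims = [h - 1 for h in profiles_per_domain]
    # One joint Gaussian draw per individual; off-diagonal blocks of `cov`
    # induce correlation between the domains' membership vectors.
    z = rng.multivariate_normal(mean, cov, size=n)

    memberships = []
    start = 0
    for d in dims:
        block = z[:, start:start + d]
        logits = np.hstack([block, np.zeros((n, 1))])  # append reference logit
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(logits)
        memberships.append(weights / weights.sum(axis=1, keepdims=True))
        start += d
    return memberships


# Two illustrative domains with 3 and 2 profiles: latent dimension 2 + 1 = 3.
# The covariance entries are made-up values chosen to be positive definite.
cov = np.array([
    [1.0, 0.3, 0.5],
    [0.3, 1.0, 0.2],
    [0.5, 0.2, 1.0],
])
lam = sample_memberships(1000, [3, 2], np.zeros(3), cov)
```

Each element of `lam` is an array whose rows lie on the probability simplex for one domain. In a sketch like this, auxiliary covariates could enter by making `mean` a linear function of individual-level predictors, which is one way to read the latent multivariate logistic regression mentioned above.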
