Gaussian Latent Dirichlet Allocation for Discrete Human State Discovery

by Congyu Wu, et al.

In this article we propose and validate an unsupervised probabilistic model, Gaussian Latent Dirichlet Allocation (GLDA), for discovering discrete states from repeated, multivariate psychophysiological samples collected from multiple, inherently distinct individuals. Psychological and medical research often involves measuring potentially related but individually inconclusive variables across a cohort of participants to derive a diagnosis, which calls for cluster analysis. Traditional probabilistic clustering models such as the Gaussian Mixture Model (GMM) assume a single global mixture of component distributions, which may not be realistic for observations from different patients. The GLDA model borrows the individual-specific mixture structure of Latent Dirichlet Allocation (LDA), a popular topic model in natural language processing, and merges it with the Gaussian component distributions of the GMM to suit continuous data. We implemented GLDA in Stan (a probabilistic modeling language) and applied it to two datasets, one containing Ecological Momentary Assessments (EMA) and the other containing heart measures from electrocardiogram and impedance cardiograph. In both datasets, the GLDA-learned class weights achieved significantly higher correlations with clinically assessed depression, anxiety, and stress scores than those produced by the baseline GMM. Our findings demonstrate the advantage of GLDA over conventional finite mixture models for human state discovery from repeated multivariate data, likely because it better characterizes potential underlying between-participant differences. Future work is required to validate the utility of this model on a broader range of applications.
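The structure described in the abstract (per-individual mixture weights in the style of LDA, with Gaussian components shared across individuals) can be illustrated with a small generative sketch. This is not the authors' Stan implementation. The function name `sample_glda`, the symmetric Dirichlet prior `alpha`, and the fixed component means and covariances are all assumptions made for illustration only.

```python
import numpy as np

def sample_glda(n_individuals, n_obs, means, covs, alpha, seed=0):
    """Draw synthetic data from a GLDA-style generative process (illustrative).

    Assumed structure: each individual i has their own state weights
    theta_i ~ Dirichlet(alpha); each observation picks a state
    z ~ Categorical(theta_i) and is then drawn from that state's Gaussian.
    The Gaussian components are shared by all individuals.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)   # shape (K, D)
    covs = np.asarray(covs, dtype=float)     # shape (K, D, D)
    k, d = means.shape

    # Individual-specific mixture weights, one row per individual.
    theta = rng.dirichlet(alpha, size=n_individuals)   # shape (N, K)

    z = np.empty((n_individuals, n_obs), dtype=int)
    x = np.empty((n_individuals, n_obs, d))
    for i in range(n_individuals):
        # Latent discrete state of each observation for individual i.
        z[i] = rng.choice(k, size=n_obs, p=theta[i])
        for j in range(n_obs):
            # Continuous observation from the chosen state's Gaussian.
            x[i, j] = rng.multivariate_normal(means[z[i, j]], covs[z[i, j]])
    return theta, z, x

# Hypothetical example: 3 individuals, 50 two-dimensional samples each, 2 states.
means = [[0.0, 0.0], [5.0, 5.0]]
covs = [np.eye(2), np.eye(2)]
theta, z, x = sample_glda(3, 50, means, covs, alpha=[1.0, 1.0])
```

A plain GMM would correspond to replacing the per-individual rows of `theta` with one weight vector shared by everyone. The rows of `theta` play the role of the individual class weights that the abstract compares against clinical scores.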




