Topic Analysis for Text with Side Data

03/01/2022
by Biyi Fang, et al.

Although latent factor models (e.g., matrix factorization) achieve good predictive performance, they suffer from several problems, including cold-start, non-transparency, and suboptimal recommendations. In this paper, we use text with side data to address these limitations. We introduce a hybrid generative probabilistic model that combines a neural network with a latent topic model, which is a four-level hierarchical Bayesian model. In the model, each document is modeled as a finite mixture over an underlying set of topics, and each topic is modeled as an infinite mixture over an underlying set of topic probabilities. Furthermore, each topic probability is modeled as a finite mixture over side data. For a given text, the neural network provides an overview distribution over the side data, which serves as the prior distribution in latent Dirichlet allocation (LDA) to help perform topic grouping. The approach is evaluated on several datasets, where the model is shown to outperform standard LDA and Dirichlet-multinomial regression (DMR) in topic grouping, model perplexity, classification, and comment generation.
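
As a rough illustration of the idea in the abstract, the sketch below lets a small network turn a side-data vector into the Dirichlet prior of an LDA-style generative process. It is not the authors' architecture: the one-hidden-layer network, the softplus output, the random weights, and every name (`side_data_to_prior`, `generate_document`, `topic_word`) are assumptions made for illustration only.

```python
import numpy as np

def side_data_to_prior(side, W1, b1, W2, b2):
    """Hypothetical one-hidden-layer network: side-data features -> Dirichlet concentrations."""
    hidden = np.tanh(side @ W1 + b1)
    # Softplus keeps every concentration strictly positive; the small offset avoids values near zero.
    return np.logaddexp(0.0, hidden @ W2 + b2) + 1e-3

def generate_document(alpha, topic_word, n_words, rng):
    """LDA-style generation: draw topic proportions, then a topic and a word per token."""
    theta = rng.dirichlet(alpha)                              # document-topic proportions
    topics = rng.choice(len(alpha), size=n_words, p=theta)    # topic assignment per token
    vocab_size = topic_word.shape[1]
    words = np.array([rng.choice(vocab_size, p=topic_word[k]) for k in topics])
    return theta, topics, words

rng = np.random.default_rng(0)
n_side, n_hidden, n_topics, vocab_size = 4, 8, 3, 20

# Untrained, randomly initialised weights (assumption: a real model would learn these).
W1 = rng.normal(size=(n_side, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_topics))
b2 = np.zeros(n_topics)

# Each row is one topic's distribution over the vocabulary.
topic_word = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)

side = np.array([1.0, 0.0, 0.0, 1.0])   # made-up side-data features for one document
alpha = side_data_to_prior(side, W1, b1, W2, b2)
theta, topics, words = generate_document(alpha, topic_word, 50, rng)
```

Documents with different side data receive different priors `alpha`, so their topic proportions `theta` are pulled toward different topics before any words are observed. That is the intuition behind using side data to guide topic grouping.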


research
06/01/2016

On a Topic Model for Sentences

Probabilistic topic models are generative models that describe the conte...
research
07/11/2012

The Author-Topic Model for Authors and Documents

We introduce the author-topic model, a generative model for documents th...
research
04/26/2020

Neural Topic Modeling with Bidirectional Adversarial Training

Recent years have witnessed a surge of interests of using neural topic m...
research
04/23/2020

A Gamma-Poisson Mixture Topic Model for Short Text

Most topic models are constructed under the assumption that documents fo...
research
01/12/2015

Autodetection and Classification of Hidden Cultural City Districts from Yelp Reviews

Topic models are a way to discover underlying themes in an otherwise uns...
research
10/23/2014

Model Selection for Topic Models via Spectral Decomposition

Topic models have achieved significant successes in analyzing large-scal...
research
08/05/2015

Progressive EM for Latent Tree Models and Hierarchical Topic Detection

Hierarchical latent tree analysis (HLTA) is recently proposed as a new m...
