Sparsemax and Relaxed Wasserstein for Topic Sparsity

10/22/2018
by Tianyi Lin, et al.

Topic sparsity refers to the observation that individual documents usually focus on a few salient topics rather than covering a wide variety of them, and that a real topic uses a narrow range of terms rather than the whole vocabulary. Understanding topic sparsity is especially important for analyzing user-generated web content and social media, which typically take the form of extremely short posts and discussions. As the topic sparsity of individual documents in online social media increases, so does the difficulty of analyzing these text sources with traditional methods. In this paper, we propose two novel neural models that provide sparse posterior distributions over topics based on the Gaussian sparsemax construction, which enables efficient training by stochastic backpropagation. We construct an inference network conditioned on the input data and infer the variational distribution with the relaxed Wasserstein (RW) divergence. Unlike existing work based on the Gaussian softmax construction and the Kullback-Leibler (KL) divergence, our approaches can identify latent topic sparsity while retaining training stability, predictive performance, and topic coherence. Experiments on large text corpora of different genres demonstrate the effectiveness of our models, which outperform both probabilistic and neural methods.
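To make the Gaussian sparsemax construction mentioned in the abstract more concrete, here is a minimal sketch in plain Python. It assumes the standard sparsemax definition (Euclidean projection of a score vector onto the probability simplex) and a diagonal Gaussian sampled with the reparameterization trick. The function names, the example parameters, and the omission of any learned linear map before the projection are illustrative assumptions, not the paper's implementation.

```python
import random


def sparsemax(z):
    """Project the score vector z onto the probability simplex (sparsemax)."""
    z_sorted = sorted(z, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, value in enumerate(z_sorted, start=1):
        cumulative += value
        # The k-th largest score is in the support while
        # 1 + k * z_(k) exceeds the sum of the k largest scores.
        if 1.0 + k * value > cumulative:
            tau = (cumulative - 1.0) / k
    # Scores at or below the threshold tau become exactly zero.
    return [max(v - tau, 0.0) for v in z]


def sample_topic_proportions(mu, sigma, rng):
    """Hypothetical helper: draw x ~ N(mu, diag(sigma^2)), return sparsemax(x).

    Assumption: the Gaussian draw is projected directly, with no learned
    linear transformation in between.
    """
    # Reparameterization: x = mu + sigma * eps, with eps ~ N(0, 1).
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    x = [m + s * e for m, s, e in zip(mu, sigma, eps)]
    return sparsemax(x)


# Usage: a dominant score pushes the other topics to exactly zero.
peaked = sparsemax([2.0, 1.0, 0.1])  # → [1.0, 0.0, 0.0]

# Usage: one sampled topic-proportion vector over four hypothetical topics.
rng = random.Random(0)
theta = sample_topic_proportions([1.5, 0.2, -0.3, 0.0], [0.5, 0.5, 0.5, 0.5], rng)
```

Unlike softmax, which assigns every topic a strictly positive weight, the projection can return exact zeros, which is what makes the resulting topic proportions sparse.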

research
06/10/2019

Topic-Aware Neural Keyphrase Generation for Social Media Language

A huge volume of user-generated content is daily produced on social medi...
research
06/01/2017

Discovering Discrete Latent Topics with Neural Variational Inference

Topic models have been widely explored as probabilistic generative model...
research
08/11/2020

Context Reinforced Neural Topic Modeling over Short Texts

As one of the prevalent topic mining tools, neural topic modeling has at...
research
09/20/2022

Twitter Topic Classification

Social media platforms host discussions about a wide variety of topics t...
research
09/19/2018

Modeling Online Discourse with Coupled Distributed Topics

In this paper, we propose a deep, globally normalized topic model that i...
research
06/15/2021

Author Clustering and Topic Estimation for Short Texts

Analysis of short text, such as social media posts, is extremely difficu...
research
02/06/2020

Conversational Structure Aware and Context Sensitive Topic Model for Online Discussions

Millions of online discussions are generated everyday on social media pl...