Feature Studies to Inform the Classification of Depressive Symptoms from Twitter Data for Population Health

01/28/2017
by Danielle Mowery, et al.

The utility of Twitter data as a medium to support population-level mental health monitoring is not well understood. To better understand the predictive power of supervised machine learning classifiers and the influence of feature sets on efficiently classifying depression-related tweets at scale, we conducted two feature study experiments. In the first experiment, we assessed the contribution of feature groups such as lexical information (e.g., unigrams) and emotions (e.g., strongly negative) using a feature ablation study. In the second experiment, we determined the percentile of top-ranked features that produced the optimal classification performance by applying a three-step feature elimination approach. In the first experiment, we observed that lexical features are critical for identifying depressive symptoms, specifically for depressed mood (-35 points) and for disturbed sleep (-43 points). In the second experiment, we observed that the percentile of top-ranked features yielding the optimal F1-score varied across classes, from fatigue or loss of energy (5th percentile, 288 features) to depressed mood (55th percentile, 3,168 features), suggesting that no single feature count is optimal for predicting depression-related tweets. We conclude that simple lexical features and reduced feature sets can produce results comparable to those of larger feature sets.
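The two experiments described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`f1_score`, `ablation_deltas`, `top_percentile`, `toy_evaluate`) are hypothetical, the toy scores are placeholders rather than results from the paper, and the ranking step of the three-step elimination approach is assumed to have already produced an ordered feature list. The total of 5,760 features is not stated in the abstract; it is inferred from the two reported counts (288 / 0.05 = 3,168 / 0.55 = 5,760).

```python
def f1_score(gold, pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ablation_deltas(evaluate, feature_groups):
    """Remove one feature group at a time and report the F1 change in points.

    `evaluate` is any callable that takes a list of group names and returns
    an F1-score in [0, 1] (for example, by training and scoring a classifier).
    """
    baseline = evaluate(feature_groups)
    deltas = {}
    for group in feature_groups:
        kept = [g for g in feature_groups if g != group]
        deltas[group] = round(100 * (evaluate(kept) - baseline), 1)
    return deltas


def top_percentile(ranked_features, percentile):
    """Keep the top `percentile` percent (an integer) of a ranked feature list."""
    # Integer ceiling, so a partial feature always rounds up to one feature.
    k = (len(ranked_features) * percentile + 99) // 100
    return ranked_features[:k]


def toy_evaluate(groups):
    """Placeholder scorer (assumption): the values are invented, chosen only so
    that dropping the lexical group costs 35 points, as reported for depressed mood."""
    return 0.50 if "lexical" in groups else 0.15


if __name__ == "__main__":
    print(ablation_deltas(toy_evaluate, ["lexical", "emotion"]))
    ranked = [f"feature_{i}" for i in range(5760)]  # inferred total, see above
    print(len(top_percentile(ranked, 5)))   # 288
    print(len(top_percentile(ranked, 55)))  # 3168
```

In practice, `toy_evaluate` would be replaced by a function that trains a classifier on the retained feature groups and scores it on held-out tweets with `f1_score`, and the percentile would be swept per class to find where F1 peaks.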


