Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

by Xuhai Xu, et al.
UMass Lowell
Northeastern University
Rensselaer Polytechnic Institute
University of Washington
Stanford University

Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant gap in research on understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present the first comprehensive evaluation of multiple LLMs, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4, on various mental health prediction tasks via online text data. We conduct a broad range of experiments, covering zero-shot prompting, few-shot prompting, and instruction fine-tuning. The results indicate a promising yet limited performance of LLMs with zero-shot and few-shot prompt designs for the mental health tasks. More importantly, our experiments show that instruction fine-tuning can significantly boost the performance of LLMs for all tasks simultaneously. Our best fine-tuned models, Mental-Alpaca and Mental-FLAN-T5, outperform the best prompt design of GPT-3.5 (25 and 15 times bigger) by 10.9% on balanced accuracy and the best of GPT-4 (250 and 150 times bigger) by 4.8%. They further perform on par with the state-of-the-art task-specific language model. We also conduct an exploratory case study on LLMs' capability on mental health reasoning tasks, illustrating the promising capability of certain models such as GPT-4. We summarize our findings into a set of action guidelines for potential methods to enhance LLMs' capability for mental health tasks. Meanwhile, we also emphasize the important limitations that must be addressed before achieving deployability in real-world mental health settings, such as known racial and gender bias. We highlight the important ethical risks accompanying this line of research.




Dynamic Strategy Chain: Dynamic Zero-Shot CoT for Long Mental Health Support Generation

Long counseling Text Generation for Mental health support (LTGM), an inn...

On the Evaluations of ChatGPT and Emotion-enhanced Prompting for Mental Health Analysis

Automated mental health analysis shows great potential for enhancing the...

Large Language Models are Few-Shot Health Learners

Large language models (LLMs) can capture rich representations of concept...

Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

Text language models have shown remarkable zero-shot capability in gener...

DREAM: Uncovering Mental Models behind Language Models

To what extent do language models (LMs) build "mental models" of a scene...

GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities

The global economy is increasingly dependent on knowledge workers to mee...

Psy-LLM: Scaling up Global Mental Health Psychological Services with AI-based Large Language Models

The demand for psychological counseling has grown significantly in recen...
