OpenAssistant Conversations – Democratizing Large Language Model Alignment

04/14/2023
by Andreas Köpf, et al.

Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption, as demonstrated by ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) greatly reduce the skill and domain knowledge required to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains. However, state-of-the-art alignment techniques like RLHF rely on high-quality human feedback data, which is expensive to create and often remains proprietary. In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages distributed across 66,497 conversation trees, in 35 different languages, annotated with 461,292 quality ratings. The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers. To demonstrate the OpenAssistant Conversations dataset's effectiveness, we present OpenAssistant, the first fully open-source large-scale instruction-tuned model to be trained on human data. A preference study revealed that OpenAssistant replies are comparably preferred to those of GPT-3.5-turbo (ChatGPT), with relative win rates of 48.3% vs. 51.7%, respectively. We release our code and data under fully permissive licenses.
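
The abstract describes the corpus as messages distributed across conversation trees. As a rough illustration of what that structure implies, here is a minimal Python sketch in which each message can have several alternative replies, so one tree contains several distinct conversation threads. The class name `Message`, its fields, the role labels, and the helper functions are hypothetical and chosen only for this example; they are not the released dataset's schema.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Message:
    """One node of a conversation tree (hypothetical schema, for illustration)."""
    role: str   # assumed labels: "prompter" or "assistant"
    text: str
    replies: List["Message"] = field(default_factory=list)


def count_messages(root: Message) -> int:
    """Count every message in the tree rooted at `root`."""
    return 1 + sum(count_messages(reply) for reply in root.replies)


def threads(root: Message) -> Iterator[List[Message]]:
    """Yield each root-to-leaf path, i.e. each complete conversation thread."""
    if not root.replies:
        yield [root]
        return
    for reply in root.replies:
        for tail in threads(reply):
            yield [root] + tail


# A toy tree: one prompt, two alternative assistant replies, one follow-up.
tree = Message("prompter", "What is RLHF?", [
    Message("assistant", "Reply A", [Message("prompter", "Follow-up")]),
    Message("assistant", "Reply B"),
])

print(count_messages(tree))       # number of messages in the tree
print(len(list(threads(tree))))   # number of distinct threads
```

Under this reading, sibling replies to the same parent are the alternatives that annotators could rate against one another, which is the kind of comparison that preference-based methods such as RLHF rely on.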


