EduBERT: Pretrained Deep Language Models for Learning Analytics

12/02/2019
by Benjamin Clavié, et al.

The use of large pretrained neural networks to create contextualized word embeddings has drastically improved performance on several natural language processing (NLP) tasks. These computationally expensive models have begun to be applied to domain-specific NLP tasks such as re-hospitalization prediction from clinical notes. This paper demonstrates that using large pretrained models produces excellent results on common learning analytics tasks. Pretraining deep language models using student forum data from a wide array of online courses improves performance beyond the state of the art on three text classification tasks. We also show that a smaller, distilled version of our model produces the best results on two of the three tasks while limiting computational cost. We make both models available to the research community at large.
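To make the approach concrete, the sketch below shows how a BERT-style checkpoint of this kind is typically fine-tuned and applied to a learning analytics text classification task. It is a minimal illustration using the Hugging Face transformers and PyTorch APIs; "bert-base-uncased", the example forum post, and the binary label are placeholders for illustration, not the actual EduBERT checkpoint name or task labels, which are not given here.

    # Minimal fine-tuning/inference sketch using Hugging Face transformers.
    # "bert-base-uncased" is a placeholder: substitute the released EduBERT
    # (or distilled EduBERT) checkpoint identifier once obtained.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_name = "bert-base-uncased"  # placeholder checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=2
    )

    # Illustrative forum post and binary label (e.g. "needs instructor help").
    batch = tokenizer(
        ["I am completely lost on assignment 3, can anyone help?"],
        padding=True, truncation=True, return_tensors="pt",
    )
    labels = torch.tensor([1])

    # One fine-tuning step: cross-entropy loss over the classification head.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()

    # Inference: class probabilities for a post.
    model.eval()
    with torch.no_grad():
        probs = model(**batch).logits.softmax(dim=-1)
    print(probs)

A distilled variant would be loaded the same way by pointing model_name at the smaller checkpoint; the rest of the pipeline is unchanged, which is what keeps the distilled model's computational cost low at inference time.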
