Pre-training Polish Transformer-based Language Models at Scale

06/07/2020
by Sławomir Dadas, et al.

Transformer-based language models are now widely used in Natural Language Processing (NLP). This is especially true for the English language, for which many pre-trained models using transformer-based architectures have been published in recent years. This has driven forward the state of the art in a variety of standard NLP tasks such as classification, regression, and sequence labeling, as well as text-to-text tasks such as machine translation, question answering, and summarization. The situation has been different for low-resource languages such as Polish, however. Although some transformer-based language models for Polish are available, none of them approach the scale, in terms of corpus size and number of parameters, of the largest English-language models. In this study, we present two language models for Polish based on the popular BERT architecture. The larger model was trained on a dataset consisting of over 1 billion Polish sentences, or 135 GB of raw text. We describe our methodology for collecting the data, preparing the corpus, and pre-training the model. We then evaluate our models on thirteen Polish linguistic tasks, and demonstrate improvements over previous approaches in eleven of them.
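As a rough illustration of the BERT-style pre-training objective the abstract refers to, the following is a minimal sketch of the standard masked language modeling (MLM) corruption scheme: roughly 15% of token positions are selected, and of those, 80% are replaced with a `[MASK]` token, 10% with a random token, and 10% left unchanged. This is the generic BERT recipe, not the authors' exact pipeline; the `MASK` symbol and `VOCAB_SIZE` value here are illustrative placeholders.

```python
import random

MASK = "[MASK]"
VOCAB_SIZE = 30000  # illustrative vocabulary size, not the paper's


def mask_for_mlm(tokens, mask_prob=0.15, seed=None):
    """BERT-style MLM masking.

    Selects ~mask_prob of positions; of those, 80% become MASK,
    10% a random token id, 10% stay unchanged. Returns
    (masked_tokens, labels), where labels[i] holds the original
    token at selected positions and None elsewhere.
    """
    rng = random.Random(seed)
    masked, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        r = rng.random()
        if r < 0.8:
            masked[i] = MASK
        elif r < 0.9:
            masked[i] = rng.randrange(VOCAB_SIZE)  # random replacement
        # else: keep the original token (still predicted)
    return masked, labels


# Example on a short Polish sentence (tokens, not subwords, for brevity)
sentence = ["Ala", "ma", "kota", "i", "psa"]
masked, labels = mask_for_mlm(sentence, seed=0)
```

During pre-training, the model is trained to predict the original token only at the positions where `labels` is not `None`; all other positions contribute no loss.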


