Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing

09/22/2021
by Kamal Raj Kanakarajan, et al.

Recent progress in the Natural Language Processing domain has given us several State-of-the-Art (SOTA) pretrained models which can be fine-tuned for specific tasks. These large models, with billions of parameters trained on numerous GPUs/TPUs over weeks, lead the benchmark leaderboards. In this paper, we discuss the need for a benchmark for cost- and time-effective smaller models trained on a single GPU. This will enable researchers with resource constraints to experiment with novel and innovative ideas on tokenization, pretraining tasks, architectures, fine-tuning methods, etc. We set up Small-Bench NLP, a benchmark for small, efficient neural language models trained on a single GPU. The Small-Bench NLP benchmark comprises eight NLP tasks on the publicly available GLUE datasets and a leaderboard to track the progress of the community. Our small ELECTRA-DeBERTa model (15M parameters) achieves an average score of 81.53, comparable to BERT-Base's 82.20 (110M parameters). Our models, code and leaderboard are available at https://github.com/smallbenchnlp
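
To make the evaluation setup concrete, below is a minimal sketch of fine-tuning and scoring a small pretrained model on one GLUE task (SST-2) with the Hugging Face datasets/transformers libraries. This is illustrative only: it is not taken from the Small-Bench NLP repository, and the "google/electra-small-discriminator" checkpoint stands in for the paper's ELECTRA-DeBERTa model; any checkpoint small enough to train on a single GPU could be substituted.

```python
# Illustrative sketch: fine-tune a small pretrained model on one GLUE task
# (SST-2) and report its validation score. This is NOT the Small-Bench NLP
# code; the ELECTRA-small checkpoint stands in for the paper's ELECTRA-DeBERTa.
import numpy as np
import evaluate
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

CHECKPOINT = "google/electra-small-discriminator"  # a ~14M-parameter model

dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)

def tokenize(batch):
    # SST-2 is single-sentence; other GLUE tasks pass sentence pairs instead.
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT,
                                                           num_labels=2)
metric = evaluate.load("glue", "sst2")  # accuracy for SST-2

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return metric.compute(predictions=np.argmax(logits, axis=-1),
                          references=labels)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sst2-small", num_train_epochs=3,
                           per_device_train_batch_size=32, learning_rate=3e-5),
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # e.g. {'eval_accuracy': ...}
```

Repeating this loop over the benchmark's eight GLUE tasks and averaging the per-task scores yields the single leaderboard number of the kind reported above (81.53 for ELECTRA-DeBERTa vs. 82.20 for BERT-Base).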
