We present Belebele, a multiple-choice machine reading comprehension (MR...
In this work, we develop and release Llama 2, a collection of pretrained...
Prompt tuning is one of the successful approaches for parameter-efficien...
Whether by processing videos with fixed resolution from start to end or ...
We introduce Progressive Prompts - a simple and efficient approach for c...
Large multilingual language models typically rely on a single vocabulary...
Masked Language Modeling (MLM) has proven to be an essential component o...
Evaluating an explanation's faithfulness is desired for many reasons suc...
Using natural language as a supervision for training visual recognition ...
When a neural language model (LM) is adapted to perform a new task, what...
Distilling state-of-the-art transformer models into lightweight student ...
Conventional fine-tuning of pre-trained language models tunes all model ...
Large pre-trained language models (LMs) have demonstrated remarkable abi...
Current NLP models are predominantly trained through a pretrain-then-fin...
In this paper, we introduce UnifiedM2, a general-purpose misinformation ...
Few-shot learning has drawn researchers' attention to overcome the probl...
Closed-book question-answering (QA) is a challenging task that requires ...
Pre-trained language models have proven their unique powers in capturing...
Pretraining NLP models with variants of Masked Language Model (MLM) obje...
Large transformer models have shown extraordinary success in achieving s...
Recent work has suggested that language models (LMs) store both common-s...
We introduce the Scratchpad Mechanism, a novel addition to the sequence-...
Community-based question answering (CQA) websites represent an important...
Emails in the workplace are often intentional calls to action for its re...