The advent of large language models trained on code (code LLMs) has led ...
Recent work has shown that fine-tuning large pre-trained language models...
Scaling up language models has led to unprecedented performance gains, b...
Large language models (LLMs) have exhibited remarkable capabilities in l...
Pre-trained masked language models successfully perform few-shot learnin...
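Where the line above mentions few-shot learning with masked language models, a common realization is cloze-style prompting: the task is rewritten as a fill-in-the-blank pattern and label words are scored by the masked LM instead of training a classification head. Below is a minimal sketch using the Hugging Face fill-mask pipeline; the model name, the pattern, and the label words ("great"/"terrible") are illustrative assumptions, not the paper's setup.

from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")

review = "The plot was predictable and the acting was wooden."
prompt = f"{review} All in all, it was <mask>."

# Score only the verbalizer words; the higher-scoring one is the label.
scores = {out["token_str"].strip(): out["score"]
          for out in fill(prompt, targets=[" great", " terrible"])}
print(max(scores, key=scores.get))  # expected: "terrible"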
Hate speech detection is complex; it relies on commonsense reasoning, kn...
Prior work on language model pre-training has explored different archite...
All-MLP architectures have attracted increasing interest as an alternati...
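The core idea behind all-MLP models is replacing self-attention with an MLP applied across the sequence dimension. The sketch below follows the MLP-Mixer-style formulation, named explicitly here because the exact architecture behind the truncated line is not shown; all sizes are illustrative assumptions.

import torch
import torch.nn as nn

class MLPBlock(nn.Module):
    def __init__(self, seq_len=128, d_model=512, expansion=4):
        super().__init__()
        self.token_mix = nn.Sequential(    # mixes information across positions
            nn.Linear(seq_len, seq_len * expansion), nn.GELU(),
            nn.Linear(seq_len * expansion, seq_len))
        self.channel_mix = nn.Sequential(  # per-position feed-forward
            nn.Linear(d_model, d_model * expansion), nn.GELU(),
            nn.Linear(d_model * expansion, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        # Token mixing operates on the transposed (batch, d_model, seq_len) view.
        x = x + self.token_mix(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        return x + self.channel_mix(self.norm2(x))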
Mixture of Experts layers (MoEs) enable efficient scaling of language mo...
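The efficiency claim rests on conditional computation: an MoE layer holds many expert feed-forward networks but routes each token to only a few of them, so parameter count grows with the number of experts while per-token compute stays roughly constant. A minimal sketch with top-1 gating follows; module names and sizes are illustrative assumptions, not any paper's exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)  # learned router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts))

    def forward(self, x):  # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)  # routing probabilities
        top_p, top_i = scores.max(dim=-1)         # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_i == e                     # tokens routed to expert e
            if mask.any():
                # Only the selected expert runs for these tokens, which is
                # why FLOPs per token do not grow with n_experts.
                out[mask] = top_p[mask, None] * expert(x[mask])
        return out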
State-of-the-art natural language understanding classification models fo...
Unsupervised pre-training has led to much recent progress in natural lan...
The structured representation for semantic parsing in task-oriented assi...
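A widely used structured representation for task-oriented semantic parsing is a TOP-style bracketed tree of intents (IN:) and slots (SL:). The utterance, labels, and helper below are illustrative, not necessarily the format the truncated line refers to.

def parse_top(s):
    """Parse a TOP-style bracketed string into a nested (label, children) tree."""
    tokens = s.replace("[", " [ ").replace("]", " ] ").split()
    stack = [("ROOT", [])]
    i = 0
    while i < len(tokens):
        t = tokens[i]
        if t == "[":
            node = (tokens[i + 1], [])   # token after "[" is the IN:/SL: label
            stack[-1][1].append(node)
            stack.append(node)
            i += 2
        elif t == "]":
            stack.pop()
            i += 1
        else:
            stack[-1][1].append(t)       # plain utterance word
            i += 1
    return stack[0][1][0]

parse = "[IN:CREATE_REMINDER [SL:TODO call mom ] [SL:DATE_TIME tomorrow ] ]"
print(parse_top(parse))
# ('IN:CREATE_REMINDER', [('SL:TODO', ['call', 'mom']),
#                         ('SL:DATE_TIME', ['tomorrow'])])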
Online social networks provide a platform for sharing information and fr...
We present BART, a denoising autoencoder for pretraining sequence-to-seq...
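BART's denoising objective corrupts the input text and trains a sequence-to-sequence model to reconstruct the original. One of its noising functions is text infilling: contiguous spans, with lengths drawn from a Poisson(3) distribution, are each replaced by a single mask token. The sketch below illustrates that corruption step only; it clamps spans to length at least one and omits the other noising functions, so it is a simplification rather than the reference implementation.

import numpy as np

rng = np.random.default_rng(0)
MASK = "<mask>"

def text_infill(tokens, mask_ratio=0.3, lam=3.0):
    out, i = [], 0
    budget = int(len(tokens) * mask_ratio)  # total tokens to corrupt
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            span = max(1, min(int(rng.poisson(lam)), budget))  # clamped span
            out.append(MASK)   # the whole span becomes one mask token
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out

src = "the quick brown fox jumps over the lazy dog".split()
print(text_infill(src))  # corrupted input; the decoder learns to emit src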