Large Language Models (LLMs) went from non-existent to ubiquitous in the...
The computation necessary for training Transformer-based language models...
Data augmentation has been established as an efficacious approach to
sup...
The ever-growing diversity of pre-training text corpora has equipped lan...
We propose a continuous optimization framework for discovering a latent
...
Training vision or language models on large datasets can take days, if n...
Data-efficient learning algorithms are essential in many practical
appli...