Many natural language processing (NLP) tasks rely on labeled data to tra...
In this paper, we propose a large-scale language pre-training for text G...
Most existing pre-trained language representation models (PLMs) are sub-...
Knowledge distillation is an effective way to transfer knowledge from a ...
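As a concrete illustration of the distillation idea, the sketch below shows the standard distillation objective of Hinton et al. (2015) in PyTorch: the student is trained to match the teacher's temperature-softened output distribution while still fitting the hard labels. This is a minimal sketch assuming a classification setting; the function name, temperature, and mixing weight `alpha` are illustrative choices, not taken from the paper.

```python
# Minimal sketch of a standard knowledge-distillation loss
# (Hinton et al., 2015). Assumes a classification setting;
# all names and hyperparameter values here are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend a soft loss (match the teacher's softened output
    distribution) with the usual hard-label cross-entropy."""
    # Soften both distributions with the temperature.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence between softened distributions; the T^2 factor
    # keeps gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(soft_student, soft_teacher,
                         reduction="batchmean") * temperature ** 2
    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In practice the teacher's logits are computed with gradients disabled (e.g., under `torch.no_grad()`), so that only the student's parameters are updated by this loss.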