In this work, we explore joint energy-based model (EBM) training during ...
Current summarization systems yield generic summaries that are disconnec...
With thousands of academic articles shared on a daily basis, it has beco...
Byte-pair encoding (BPE) is a ubiquitous algorithm in the subword tokeni...
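The entry above names byte-pair encoding, which builds a subword vocabulary by repeatedly merging the most frequent adjacent symbol pair. A minimal sketch of one merge step follows; the toy vocabulary and helper names (`get_pair_counts`, `merge_pair`) are illustrative assumptions, not part of any abstract above.

```python
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across a vocabulary of
    (symbol-tuple -> word frequency) entries."""
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(vocab, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = {}
    for word, freq in vocab.items():
        out, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: three word types with frequencies.
vocab = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2, ("n", "e", "w"): 3}
pairs = get_pair_counts(vocab)
best = max(pairs, key=pairs.get)   # most frequent adjacent pair
vocab = merge_pair(vocab, best)    # one BPE merge step
```

In full BPE training this merge step is repeated for a fixed number of iterations, and the learned merge list is then replayed to tokenize new text.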
Class-conditional language models (CC-LMs) can be used to generate natur...
The scarcity of comprehensive up-to-date studies on evaluation metrics f...
Word embeddings derived from human-generated corpora inherit strong gend...
Task-oriented dialogue is often decomposed into three tasks: understandi...
Currently used metrics for assessing summarization algorithms do not acc...
Large-scale language models show promising text generation capabilities,...
Text summarization aims at compressing long documents into a shorter for...
Deep learning models perform poorly on tasks that require commonsense re...
While natural language processing systems often focus on a single langua...
Even as pre-trained language encoders such as BERT are shared across man...
Deep learning has improved performance on many natural language processi...
Recurrent neural networks (RNNs) serve as a fundamental building block f...
Computer vision has benefited from initializing multiple deep layers wit...