Recent years have seen increasing concerns about the private inference o...
Recent years have seen increasing concerns about the unsafe response gen...
Ad relevance modeling plays a critical role in online advertising system...
Non-Autoregressive generation is a sequence generation paradigm, which r...
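Since this abstract is cut off, the sketch below only illustrates the paradigm its opening sentence names: autoregressive decoding produces tokens one step at a time, each conditioned on the prefix, while non-autoregressive decoding drops the dependency between target tokens and predicts every position in one parallel pass. All names (`toy_logits`, the vocabulary) are illustrative stand-ins, not anything from the paper.

```python
# Toy contrast between autoregressive (AR) and non-autoregressive (NAR)
# decoding over a tiny vocabulary. The "model" is a stand-in that returns
# random per-position logits regardless of its inputs.
import numpy as np

VOCAB = ["<pad>", "a", "b", "c"]
rng = np.random.default_rng(0)

def toy_logits(prefix_len: int, position: int) -> np.ndarray:
    """Stand-in for a decoder; ignores its inputs and returns random logits."""
    return rng.normal(size=len(VOCAB))

def ar_decode(length: int) -> list:
    # AR: one forward pass per step, left to right, conditioned on the
    # tokens generated so far (here only via prefix_len, for illustration).
    out = []
    for t in range(length):
        logits = toy_logits(prefix_len=len(out), position=t)
        out.append(VOCAB[int(np.argmax(logits))])
    return out

def nar_decode(length: int) -> list:
    # NAR: the dependency between target tokens is removed, so all
    # positions are predicted in a single parallel pass.
    all_logits = np.stack([toy_logits(0, t) for t in range(length)])
    return [VOCAB[i] for i in np.argmax(all_logits, axis=-1)]

print("AR :", ar_decode(5))
print("NAR:", nar_decode(5))
```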
Sparsely activated models (SAMs), such as Mixture-of-Experts (MoE), can ...
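As a hedged illustration of the sparse-activation idea named here (not this paper's actual architecture): an MoE layer routes each token to a small top-k subset of experts, so only those experts' parameters are exercised per token while total capacity grows with the expert count. The router weights, expert matrices, and shapes below are all assumptions for the sketch.

```python
# Minimal sketch of a sparsely activated MoE layer: score experts per token,
# keep the top-k, and run only those experts.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

W_router = rng.normal(size=(d_model, n_experts))               # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """x: (d_model,) hidden state of one token."""
    scores = x @ W_router                                      # (n_experts,)
    top = np.argsort(scores)[-top_k:]                          # chosen experts
    gate = np.exp(scores[top]) / np.exp(scores[top]).sum()     # softmax gate
    # Only the selected experts are evaluated; the rest are skipped entirely.
    return sum(g * (x @ experts[i]) for g, i in zip(gate, top))

print(moe_layer(rng.normal(size=d_model)).shape)  # (8,)
```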
Pre-trained language models have led to substantial gains over a broad r...
A Transformer model with multi-head attention requires caching intermediat...
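The snippet refers to the standard key/value caching used for incremental Transformer inference: at step t, attention needs the keys and values of all previous positions, so they are stored once and reused rather than recomputed. A minimal single-head numpy sketch follows; `attend_step` and all shapes are assumptions for illustration, not the paper's code.

```python
# Incremental decoding with a KV cache: each step appends its key/value,
# then attends over everything cached so far.
import numpy as np

rng = np.random.default_rng(0)
d = 8
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
k_cache, v_cache = [], []

def attend_step(x_t: np.ndarray) -> np.ndarray:
    """One decoding step: cache this step's K/V, then attend over the cache."""
    k_cache.append(x_t @ W_k)
    v_cache.append(x_t @ W_v)
    q = x_t @ W_q
    K, V = np.stack(k_cache), np.stack(v_cache)   # (t, d) each
    w = np.exp(q @ K.T / np.sqrt(d))
    w /= w.sum()                                  # attention weights over steps
    return w @ V                                  # output for step t

for t in range(4):                                # four decoding steps
    out = attend_step(rng.normal(size=d))
print(out.shape, "cache holds", len(k_cache), "steps")
```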
Pre-training techniques are now ubiquitous in natural language proces...
The Transformer is an attention-based neural network that consists of two ...
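Because the sentence breaks off, only the generic primitive behind "attention-based" is sketched here: scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in its textbook form rather than whatever specific design this paper studies.

```python
# Textbook scaled dot-product attention in numpy.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (5, 8)
```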
Text encoders based on C-DSSM or transformers have demonstrated strong p...
Commonsense generation aims at generating plausible everyday scenario de...
In a sponsored search engine, generative retrieval models have recently been p...
Large-scale pre-trained models have attracted extensive attention in the...
This paper examines the challenging problem of learning representations ...
Pre-trained language models like BERT have achieved great success in a w...
In this paper, we present a new sequence-to-sequence pre-training model ...
Information extraction and user intention identification are central top...
The problem of optimizing unknown costly-to-evaluate functions has been ...
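This is the black-box (e.g. Bayesian) optimization setting. As a hedged sketch of its core loop, the toy below spends a fixed evaluation budget one costly query at a time and tracks the incumbent best; plain random search stands in for a real surrogate-based acquisition rule, and `expensive_objective` is an invented example.

```python
# Sequential optimization of a costly black-box function under a budget.
import numpy as np

rng = np.random.default_rng(0)

def expensive_objective(x: float) -> float:
    """Stand-in for a costly black-box function, unknown to the optimizer."""
    return -(x - 0.3) ** 2

budget, best_x, best_y = 20, None, -np.inf
for _ in range(budget):
    x = rng.uniform(0.0, 1.0)      # propose a candidate (random search here)
    y = expensive_objective(x)     # one costly evaluation
    if y > best_y:
        best_x, best_y = x, y      # keep the incumbent best
print(f"best x = {best_x:.3f}, f(x) = {best_y:.4f}")
```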