Large language models (LLMs) are capable of performing conditional seque...
N-gram matching-based evaluation metrics, such as BLEU and chrF, are wid...
Recently, DeepNorm scales Transformers into extremely deep (i.e., 1000 l...
This paper introduces WeChat AI's participation in WMT 2021 shared news ...
We participate in the WMT 2020 shared news translation task on Chinese t...