TY - JOUR
AU - Xiao, Shitao
AU - Liu, Zheng
AU - Shao, Yingxia
AU - Cao, Zhao
AD - Beijing University of Posts and Telecommunications, Beijing, China; Huawei Technologies Co., Ltd., Shenzhen, China
TI - RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
AB - Despite pre-training's progress in many important NLP tasks, it remains to explore effective pre-training strategies for dense retrieval. In this paper, we propose RetroMAE, a new retrieval oriented pre-training paradigm based on Masked Auto-Encoder (MAE). RetroMAE is highlighted by three critical designs. 1) A novel MAE workflow, where the input sentence is polluted for encoder and decoder with different masks. The sentence embedding is generated from the
JF - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
DO - 10.18653/v1/2022.emnlp-main.35
DA - 2022-01-01
UR - https://www.deepdyve.com/lp/unpaywall/retromae-pre-training-retrieval-oriented-language-models-via-masked-WNea8vecJe
DP - DeepDyve
ER -