TY - JOUR
AU - Fajcik, Martin
AU - Docekal, Martin
AU - Ondrej, Karel
AU - Smrz, Pavel
AD - Brno University of Technology, {ifajcik,idocekal,ondrej,smrz}@fit.vutbr.cz
AB - This work presents a novel four-stage open-domain QA pipeline R2-D2 (Rank twice, Read twice). The pipeline is composed of a retriever, passage reranker, extractive reader, generative reader and a mechanism that aggregates the final prediction from all system's components. We demonstrate its strength across three open-domain QA datasets: NaturalQuestions, TriviaQA and EfficientQA, surpassing state-of-the-art on the first two. Our analysis demonstrates that: (i) combining extractive and generative reader yields absolute improvements up to 5 exact match and it is at least twice [...]
TI - R2-D2: A Modular Baseline for Open-Domain Question Answering
JF - Findings of the Association for Computational Linguistics: EMNLP 2021
DO - 10.18653/v1/2021.findings-emnlp.73
DA - 2021-01-01
UR - https://www.deepdyve.com/lp/unpaywall/r2-d2-a-modular-baseline-for-open-domain-question-answering-QsuCPmaIS0
DP - DeepDyve
ER -