TY - JOUR
AU - Parshakova, Tetiana
AU - Andreoli, Jean-Marc
AU - Dymetman, Marc
AB - Standard autoregressive seq2seq models are easily trained by max-likelihood, but tend to show poor results under small-data conditions. We introduce a class of seq2seq models, GAMs (Global Autoregressive Models), which combine an autoregressive component with a log-linear component, allowing the use of global a priori features to compensate for lack of data. We train these models in two steps. In the first step, we obtain an unnormalized GAM that maximizes the likelihood of the data, but is improper for fast inference or evaluation.
TI - Global Autoregressive Models for Data-Efficient Sequence Learning
JF - Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
DO - 10.18653/v1/k19-1084
DA - 2019-01-01
UR - https://www.deepdyve.com/lp/unpaywall/global-autoregressive-models-for-data-efficient-sequence-learning-D5cxajrB4X
DP - DeepDyve
ER -