TY - JOUR
AU - 
AB - We introduce a noisy channel approach for language model prompting in few-shot text classification. Instead of computing the likelihood of the label given the input (referred to as direct models), channel models compute the conditional probability of the input given the label, and are thereby required to explain every word in the input. We use channel models for recently proposed few-shot learning methods with no or very limited updates to the language model parameters, via either in-context demonstration or prompt tuning. Our experiments show that, for both methods, channel models significantly outperform their direct counterparts, which we attribute to their stability.
N1 - Figure 1: An illustration of the direct model (P(y|x)) and the channel model (P(x|y)P(y) ∝ P(x|y)) for language model prompting in the sentiment analysis task, using the example (x, y) = ("A three-hour cinema master class.", "It was great."). The approach applies noisy channel modeling to prompting with large language models, inspired by noisy channel models in machine translation (Brown et al., 1993; Koehn et al., 2003; Yu et al., 2017; Yee et al., 2019).
TI - Noisy Channel Language Model Prompting for Few-Shot Text Classification
JF - Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
DO - 10.18653/v1/2022.acl-long.365
DA - 2022-01-01
UR - https://www.deepdyve.com/lp/unpaywall/noisy-channel-language-model-prompting-for-few-shot-text-V96CU3TnC1
DP - DeepDyve
ER - 
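
Below is a minimal sketch of the direct vs. channel scoring contrast described in the abstract, assuming the Hugging Face transformers library with GPT-2 as a stand-in language model; the verbalizers, the score() helper, and the example text are illustrative assumptions, not taken from the paper's implementation.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def score(prompt: str, target: str) -> float:
    # Sum of log P(target tokens | prompt tokens) under the causal LM.
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    target_ids = tokenizer(" " + target, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    # Positions whose next token falls inside the target span.
    start = prompt_ids.shape[1] - 1
    end = input_ids.shape[1] - 1
    return sum(log_probs[pos, input_ids[0, pos + 1]].item() for pos in range(start, end))

x = "A three-hour cinema master class."
verbalizers = {"positive": "It was great.", "negative": "It was terrible."}

# Direct model: pick the label whose verbalizer is most likely given the input, P(v(y) | x).
direct_pred = max(verbalizers, key=lambda y: score(x, verbalizers[y]))
# Channel model: pick the label whose verbalizer best explains the input,
# P(x | v(y)) P(y); with a uniform prior P(y), only P(x | v(y)) matters.
channel_pred = max(verbalizers, key=lambda y: score(verbalizers[y], x))
print(direct_pred, channel_pred)

In the channel direction the model must assign probability to every token of the input x, which is what the abstract means by channel models being "required to explain every word in the input."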