TY  - JOUR
AU  - Chen, Feilong
AU  - Meng, Fandong
AU  - Xu, Jiaming
AU  - Li, Peng
AU  - Xu, Bo
AU  - Zhou, Jie
AD  - Feilong Chen (1,2,3,4,∗), Fandong Meng (2), Jiaming Xu (1,3,†), Peng Li (2), Bo Xu (1,3,4,5), Jie Zhou (2). 1: Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China. 2: Pattern Recognition Center, WeChat AI, Tencent Inc., China. 3: Research Center for Brain-inspired Intelligence, CASIA. 4: University of Chinese Academy of Sciences. 5: Center for Excellence in Brain Science and Intelligence Technology, CAS, China. {chenfeilong2018, jiaming.xu, xubo}@ia.ac.cn; {fandongmeng, patrickpli, withtomzhou}@tencent.com
N1  - The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
AB  - Abstract: Visual Dialog is a vision-language task that requires an AI agent to engage in a conversation with humans grounded in an image. It remains a challenging task since it requires the agent to fully understand a given question before making an appropriate response not only from the textual dia- In Visual Dialog, an agent is required to answer a question given the dialog history and the visual context. In order to make an appropriate response, it is necessary for the agent to gain a proper understanding of the question, which requires it to exploit the textual dialog history and the visual context. To this end, some studies (Das et al. 2017; Lu et al. 2017a) design models to obtain features from
TI  - DMRM: A Dual-Channel Multi-Hop Reasoning Model for Visual Dialog
JF  - Proceedings of the AAAI Conference on Artificial Intelligence
DO  - 10.1609/aaai.v34i05.6248
DA  - 2020-04-03
UR  - https://www.deepdyve.com/lp/unpaywall/dmrm-a-dual-channel-multi-hop-reasoning-model-for-visual-dialog-6WciqLpwzg
DP  - DeepDyve
ER  -