TY - JOUR
AU - Suematsu, Nobuo
AU - Hayashi, Akira
TI - A multiagent reinforcement learning algorithm using extended optimal response
AB - Stochastic games provide a theoretical framework for multiagent reinforcement learning. Within this framework, Littman proposed a multiagent reinforcement learning algorithm for zero-sum stochastic games, which Hu and Wellman extended to general-sum games. Given a stochastic game, if all agents learn with their algorithm, we can expect the agents' policies to converge to a Nash equilibrium. However, agents using their algorithm always try to converge to a Nash equilibrium, independent of the policies used by the other agents. In addition, when there are multiple Nash equilibria, the agents must agree on the equilibrium they want to reach. Thus, their algorithm lacks adaptability in a sense. In this paper, we propose a multiagent reinforcement learning algorithm that uses the extended optimal response, which we introduce in this paper. It will converge to a Nash equilibrium when the other agents are adaptable; otherwise it will make an optimal response. We also provide some empirical results in three
DO - 10.1145/544741.544831
DA - 2002-07-15
UR - https://www.deepdyve.com/lp/association-for-computing-machinery/a-multiagent-reinforcement-learning-algorithm-using-extended-optimal-BlzlqpIl2k
DP - DeepDyve
ER -