Reinforcement Learning

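We study reinforcement learning in games. The challenge is to learn solely from the rules of a game, or from experience gained through a simulator, without relying on human knowledge of the target game or on game records. AlphaZero can also be regarded as an application of reinforcement learning.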

Domains

HFO
Rogue, Catan, StarCraft II

Exploration

AlphaZero

Nakayashiki, T. and T. Kaneko “Maximum entropy reinforcement learning in two-player perfect information games,” in IEEE SSCI, pp. 1–8 (2021), DOI: 10.1109/SSCI50451.2021.9659991.

Option/skill

Kanagawa, Y. and T. Kaneko “Diverse Exploration via InfoMax Options,” arXiv preprint arXiv:2010.02756, pp. 1–21 (2020), https://arxiv.org/abs/2010.02756.

Maximum entropy reinforcement learning

合田拓矢, 金子知適 “Evaluation of Soft Actor-Critic in Discrete Action Spaces” (離散行動空間における Soft Actor-Critic の評価, in Japanese), in The 25th Game Programming Workshop, pp. 175–180 (2020).

Atari

Zhu, H. and T. Kaneko “Residual Network for Deep Reinforcement Learning with Attention Mechanism,” J. Inf. Sci. Eng., Vol. 37, No. 3, pp. 517–533 (2021), DOI: 10.6688/JISE.20210537(3).0002.
Hyunwoo, O. and T. Kaneko “Deep Recurrent Q-Network with Truncated History,” in IEEE Technologies and Applications of Artificial Intelligence, pp. 34–39 (2018), DOI: 10.1109/TAAI.2018.00017.