Topics -

Rogue

We study many interesting properties of reinforcement learning in Rogue, including limited observation, maze exploration, survival, and generalization.

Kanagawa, Y. and T. Kaneko “Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning,” in IEEE Conference on Games (CoG), pp. 1–8 (2019), DOI: 10.1109/CIG.2019.8848075
https://github.com/kngwyu/rogue-gym

Catan

Catan is a multi-player imperfect-information game, involving negotiation among players.

Gendre, Q. and T. Kaneko “Playing Catan with Cross-Dimensional Neural Network,” in ICONIP, pp. 580–592: Springer (2020a), DOI: 10.1007/978-3-030-63833-7_49
https://github.com/Swynfel/rust-catan

Ceramic

Azul is a multi-player perfect-information game.

Reinforcement Learning in Half Field Offense

Half Field Offense (HFO) is a subtask of RoboCup 2D simulated soccer, and reinforcement learning in HFO is a challenging research topic. We developed hierarchical advantage and presented HA-PPO, which properly handles parameterized actions (e.g., turn (action) by 30 degrees (parameter)) in PPO. The goal success rate on the one-vs-one task (agent vs. the built-in goalkeeper) is about 71%, which is the best achieved by reinforcement learning agents.
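As a rough illustration of what a parameterized action looks like, the sketch below pairs a discrete action type with one continuous parameter and computes the joint log-probability that a PPO-style update would need. It is not HA-PPO and does not implement hierarchical advantage. The action names, probabilities, and Gaussian parameters are hypothetical, and real HFO actions may take more than one parameter.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterizedAction:
    kind: str     # discrete action type, e.g. "turn"
    value: float  # continuous parameter, e.g. an angle in degrees


# Hypothetical policy output (assumption, not taken from HFO or HA-PPO):
# a categorical distribution over action types and, for each type,
# the mean and standard deviation of a Gaussian over its parameter.
KIND_PROBS = {"turn": 0.5, "dash": 0.3, "kick": 0.2}
PARAM_DIST = {"turn": (30.0, 10.0), "dash": (50.0, 20.0), "kick": (80.0, 15.0)}


def sample_action(rng: random.Random) -> ParameterizedAction:
    """Sample the action type first, then its parameter given the type."""
    kinds = list(KIND_PROBS)
    kind = rng.choices(kinds, weights=[KIND_PROBS[k] for k in kinds])[0]
    mean, std = PARAM_DIST[kind]
    return ParameterizedAction(kind, rng.gauss(mean, std))


def log_prob(action: ParameterizedAction) -> float:
    """log p(kind) + log p(value | kind) for the hybrid action."""
    mean, std = PARAM_DIST[action.kind]
    z = (action.value - mean) / std
    log_gauss = -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)
    return math.log(KIND_PROBS[action.kind]) + log_gauss
```

For example, `sample_action(random.Random(0))` draws one hybrid action, and `log_prob` of that action is the quantity whose ratio between the new and old policies appears in the PPO objective.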

In reinforcement learning (RL), agents improve their ability through interaction with an environment. The goal of RL in games is to make good agents without giving them domain knowledge. AlphaZero is one such application of RL to games.
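The interaction loop described above can be sketched with a toy problem. The two-armed bandit environment, its payoff probabilities, and the epsilon-greedy agent below are assumptions made purely for illustration and are far simpler than any of the game domains listed on this page.

```python
import random


class TwoArmedBandit:
    """Toy environment (hypothetical): arm 1 pays off more often than arm 0."""

    def __init__(self, rng: random.Random):
        self.rng = rng
        self.win_prob = [0.2, 0.8]

    def step(self, action: int) -> float:
        """Return reward 1.0 with the chosen arm's win probability, else 0.0."""
        return 1.0 if self.rng.random() < self.win_prob[action] else 0.0


def train(steps: int = 2000, epsilon: float = 0.1, seed: int = 0):
    """Epsilon-greedy agent that learns arm values only from observed rewards."""
    rng = random.Random(seed)
    env = TwoArmedBandit(rng)
    values = [0.0, 0.0]  # running estimate of each arm's mean reward
    counts = [0, 0]      # how often each arm was chosen
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = max(range(2), key=lambda a: values[a])  # exploit
        reward = env.step(action)
        counts[action] += 1
        # Incremental mean update of the chosen arm's value estimate.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts
```

The agent is never told which arm is better; it ends up preferring the better arm only because of the rewards it observes while interacting.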

Domains

HFO
Rogue, Catan, StarCraft II

Exploration

DEIR

AlphaZero

Nakayashiki, T. and T. Kaneko “Maximum entropy reinforcement learning in two-player perfect information games,” in IEEE SSCI, pp. 1–8 (2021), DOI: 10.1109/SSCI50451.2021.9659991

Option/skill

Kanagawa, Y. and T. Kaneko “Diverse Exploration via InfoMax Options,” arXiv, Vol. 2010.02756, pp. 1–21 (2020), https://arxiv.org/abs/2010.02756.

Atari

Zhu, H. and T. Kaneko “Residual Network for Deep Reinforcement Learning with Attention Mechanism,” J. Inf. Sci. Eng., Vol. 37, No. 3, pp. 517–533 (2021), DOI: 10.6688/JISE.20210537(3).0002.
Hyunwoo, O. and T. Kaneko “Deep Recurrent Q-Network with Truncated History,” in IEEE Technologies and Applications of Artificial Intelligence, pp. 34–39 (2018), DOI: 10.1109/TAAI.2018.00017.

note

For a general introduction to RL, see, e.g., OpenAI’s documentation.

Computer Go.

Papers

Mandai, Y. and T. Kaneko “RankNet for evaluation functions of the game of Go,” ICGA Journal, Vol. 41, No. 2, pp. 78–91 (2019), DOI: 10.3233/ICG-190108.
Evaluation of Game Tree Search Methods by Game Records Takeuchi, S.; Kaneko, T.; Yamaguchi, K.; IEEE Transactions on Computational Intelligence and AI in Games, 2 (4), 288 - 302, Dec. 2010.
H. Yoshimoto, K. Yoshizoe, T. Kaneko, A. Kishimoto, and K. Taura: Monte Carlo Go Has a Way to Go, Twenty-First National Conference on Artificial Intelligence (AAAI-06), pages 1070-1075, 2006

Migo

https://github.com/tkaneko/migo

Parallel or distributed search

S. Yokoyama, T. Kaneko, and T. Tanaka: Parameter-Free Tree Style Pipeline in Asynchronous Parallel Game-Tree Search, The 14th International Conference on Advances in Computers and Games
Scalable Distributed Monte-Carlo Tree Search. Kazuki Yoshizoe, Akihiro Kishimoto, Tomoyuki Kaneko, Haruhiro Yoshimoto and Yutaka Ishikawa. In Proceedings of the 4th Symposium on Combinatorial Search (SoCS'2011), pages 180-187, 2011

Monte-Carlo tree search

Y. Mandai and T. Kaneko: LinUCB Applied to Monte Carlo Tree Search, Theoretical Computer Science. Volume 644, 6 September 2016, pp.114-126.
T. Imagawa and T. Kaneko: Monte Carlo Tree Search with Robust Exploration, LNCS, Computers and Games 2016

Learning algorithms

Kaneko, T. and T. Takizawa “Computer Shogi Tournaments and Techniques,” IEEE Transactions on Games, Vol. 11, No. 3, pp. 267–274 (2019), DOI: 10.1109/TG.2019.2939259.
S. Wan and T. Kaneko. Heterogeneous Multi-Task Learning of Evaluation Functions for Chess and Shogi, ICONIP 2018.
Wan, S. and T. Kaneko “Pos2Pos: Automatic Position-to-Position Translation in Chess-Like Games,” in 23rd Game Programming Workshop, pp. 51–54, 11 (2018).
S. Wan and T. Kaneko. Building Evaluation Functions for Chess and Shogi with Uniformity Regularization Networks, IEEE CIG 2018
Zhu, H. and T. Kaneko “Comparison of Loss Functions for Training of Deep Neural Networks in Shogi,” in IEEE Technologies and Applications of Artificial Intelligence, pp. 18–23 (2018), DOI: 10.1109/TAAI.2018.00014.
K. Hoki and T. Kaneko (2014) “Large-Scale Optimization for Evaluation Functions with Minmax Search”, Volume 49, pages 527-568. JAIR

Playing styles

S. Omori and T. Kaneko: Learning of Evaluation Functions to Realize Playing Styles in Shogi, LNCS, PRICAI, pp. 367-379.

Denou-sen (2013)

GPS Shogi won against Miura 8-dan (now 9-dan) in April 2013.

Topics

Various games