Reinforcement Learning in Half Field Offense
Half Field Offense (HFO) is a subtask of RoboCup 2D simulated soccer.
Reinforcement learning in HFO is a challenging research topic.
We developed the hierarchical advantage and present HA-PPO, which properly handles parameterized actions (a discrete action paired with a continuous parameter, e.g., 30 degrees) in PPO.
On the one-vs-one task (agent vs. the built-in goalkeeper), the goal rate is about 71%, which is the best achieved by reinforcement learning agents.
- Hu, Z. and T. Kaneko. “Hierarchical Advantage for Reinforcement Learning in Parameterized Action Space,” in IEEE Conference on Games (CoG) (2021). https://ieee-cog.org/2021/assets/papers/paper_211.pdf
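
To make the notion of a parameterized action concrete, the following is a minimal Python sketch of sampling one and computing its joint log-probability, the quantity a PPO-style probability ratio is built from. It is not the HA-PPO algorithm and does not implement the hierarchical advantage; it only illustrates the action structure under stated assumptions. The action names loosely follow HFO's dash, turn, and kick actions, while `ACTION_PARAMS`, `sample_action`, `joint_log_prob`, the fixed Gaussian standard deviation, and all numeric values are hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical action set: each discrete action owns its own continuous parameters.
ACTION_PARAMS = {
    "dash": ("power", "direction"),
    "turn": ("direction",),
    "kick": ("power", "direction"),
}


def gaussian_log_prob(x, mean, std):
    """Log-density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


def sample_action(discrete_probs, param_means, param_std, rng):
    """Sample the discrete action first, then its continuous parameters."""
    names = list(discrete_probs)
    weights = [discrete_probs[n] for n in names]
    action = rng.choices(names, weights=weights)[0]
    params = {
        p: rng.gauss(param_means[action][p], param_std)
        for p in ACTION_PARAMS[action]
    }
    return action, params


def joint_log_prob(action, params, discrete_probs, param_means, param_std):
    """log pi(action, params) = log pi_d(action) + sum_p log pi_c(p | action)."""
    logp = math.log(discrete_probs[action])
    for name, value in params.items():
        logp += gaussian_log_prob(value, param_means[action][name], param_std)
    return logp


# Illustrative policy outputs (assumed values, not taken from the paper).
discrete_probs = {"dash": 0.25, "turn": 0.5, "kick": 0.25}
param_means = {
    "dash": {"power": 50.0, "direction": 0.0},
    "turn": {"direction": 30.0},
    "kick": {"power": 80.0, "direction": 10.0},
}

rng = random.Random(0)
action, params = sample_action(discrete_probs, param_means, 1.0, rng)
print(action, params)
print(joint_log_prob(action, params, discrete_probs, param_means, 1.0))
```

Because the log-probability factors into a discrete term and a continuous term, the two parts of the policy can be weighted separately during the update, which is the setting in which handling parameterized actions in PPO becomes a design question.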