Autonomous agents need to make decisions in a sequential manner, under
partially observable environment, and in consideration of how other agents
behave. In critical situations, such decisions need to be made in real time for
example to avoid collisions and recover to safe conditions. We propose a
technique of tree search where a deterministic and pessimistic scenario is used
after a specified depth. Because there is no branching with the deterministic
scenario, the proposed technique allows us to take into account far ahead in
the future in real time. The effectiveness of the proposed technique is
demonstrated in Pommerman, a multi-agent environment used in a NeurIPS 2018
competition, where the agents that implement the proposed technique have won
the first and third places.

Source link