A Narration-based Reward Shaping Approach using Grounded Natural Language Commands. (arXiv:1911.00497v1 [cs.AI])

While deep reinforcement learning techniques have produced agents that can
learn to perform a number of tasks that had previously been unlearnable, these
techniques remain susceptible to the longstanding problem of reward sparsity.
This is especially true for tasks such as training an agent to play
StarCraft II, a real-time strategy game in which reward is given only at the
end of a game, which is usually very long. While
this problem can be addressed through reward shaping, such approaches typically
require a human expert with specialized knowledge. Inspired by the vision of
enabling reward shaping through the more-accessible paradigm of
natural-language narration, we develop a technique that can provide the
benefits of reward shaping using natural language commands. Our
narration-guided RL agent projects sequences of natural-language commands into
the same high-dimensional representation space as the corresponding goal
states. We show that our method improves performance over traditional
reward-shaping approaches. Additionally, we demonstrate the ability
of our method to generalize to unseen natural-language commands.
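
As a rough illustration of the idea of placing commands and goal states in one
representation space, the sketch below adds a bonus to the environment reward
when a command embedding and a state embedding are close. It is a minimal
sketch under stated assumptions, not the paper's implementation: the abstract
does not specify the similarity measure, the threshold, or the bonus size, so
cosine similarity, `threshold`, and `bonus` are all hypothetical choices, and
the embedding vectors stand in for the outputs of learned encoders.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def shaped_reward(env_reward, command_embedding, state_embedding,
                  threshold=0.9, bonus=1.0):
    """Return the environment reward plus a bonus when the state matches the command.

    Assumption: both embeddings already live in the same space. The threshold
    and bonus values are illustrative and do not come from the paper.
    """
    similarity = cosine_similarity(command_embedding, state_embedding)
    if similarity >= threshold:
        return env_reward + bonus
    return env_reward
```

With this sketch, a state whose embedding aligns with the command embedding
earns the bonus even when the sparse environment reward is zero, which gives
the agent intermediate feedback before the end of a long game.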

