Bias-Reduced Hindsight Experience Replay with Virtual Goal Prioritization. (arXiv:1905.05498v2 [cs.LG] UPDATED)

Hindsight Experience Replay (HER) is a multi-goal reinforcement learning
algorithm for sparse reward functions. The algorithm treats every failure as a
success for an alternative (virtual) goal that has been achieved in the
episode. Virtual goals are randomly selected, irrespective of which are most
instructive for the agent. In this paper, we present two improvements over the
existing HER algorithm. First, we prioritize virtual goals from which the agent
will learn more valuable information. We call this property the instructiveness
of the virtual goal and define it by a heuristic measure, which expresses how
well the agent will be able to generalize from that virtual goal to actual
goals. Second, we reduce existing bias in HER by removing misleading
samples. To test our algorithms, we built two challenging environments with
sparse reward functions. Our empirical results in both environments show vast
improvement in the final success rate and sample efficiency when compared to
the original HER algorithm. The authors also provide a video showing the
experimental results.
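
To make the relabeling idea concrete, here is a minimal Python sketch of hindsight relabeling for one episode. It is an illustration under stated assumptions, not the authors' implementation. The names `relabel_episode`, `sparse_reward`, and `weight_fn` are hypothetical, as are the 0/-1 reward convention and the choice to draw virtual goals from the current and later steps of the same episode. With `weight_fn=None` the virtual goals are drawn uniformly, as in the baseline described above. Passing a `weight_fn` only shows where a prioritization score could plug in. The abstract does not define the instructiveness heuristic, so none is implemented here.

```python
import random
from typing import Callable, List, Optional, Tuple

Goal = Tuple[float, ...]
State = Tuple[float, ...]
# One recorded step: (state, action, goal achieved after the action).
Step = Tuple[State, int, Goal]
# One replay transition: (state, action, goal, reward).
Transition = Tuple[State, int, Goal, float]


def sparse_reward(achieved: Goal, goal: Goal, tol: float = 1e-6) -> float:
    """Assumed sparse reward: 0.0 if the goal is reached, -1.0 otherwise."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_episode(
    episode: List[Step],
    k: int = 4,
    weight_fn: Optional[Callable[[Goal], float]] = None,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Create k hindsight transitions per step by substituting virtual goals.

    Virtual goals are goals actually achieved at the current or a later
    step of the same episode. weight_fn is a hypothetical hook for a
    priority score; it must return a positive weight for at least one
    candidate goal at every step.
    """
    rng = rng or random.Random()
    transitions: List[Transition] = []
    for t, (state, action, achieved) in enumerate(episode):
        candidates = [step[2] for step in episode[t:]]
        weights = None if weight_fn is None else [weight_fn(g) for g in candidates]
        # Uniform sampling when weights is None, weighted sampling otherwise.
        for goal in rng.choices(candidates, weights=weights, k=k):
            reward = sparse_reward(achieved, goal)
            transitions.append((state, action, goal, reward))
    return transitions


# Usage: a three-step episode on a line that never reached its actual goal.
episode = [((0.0,), 1, (1.0,)), ((1.0,), 1, (2.0,)), ((2.0,), 1, (3.0,))]
replay = relabel_episode(episode, k=2, rng=random.Random(0))
```

Each relabeled transition receives a success reward whenever the step reached the substituted goal, which is how a failed episode still produces a learning signal.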

