Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning. (arXiv:2005.10872v1 [cs.RO])

Traditional robotic approaches rely on an accurate model of the environment,
a detailed description of how to perform the task, and a robust perception
system to keep track of the current state. On the other hand, reinforcement
learning approaches can operate directly from raw sensory inputs with only a
reward signal to describe the task, but are extremely sample-inefficient and
brittle. In this work, we combine the strengths of model-based methods with the
flexibility of learning-based methods to obtain a general method that is able
to overcome inaccuracies in the robotics perception/actuation pipeline, while
requiring minimal interactions with the environment. This is achieved by
leveraging uncertainty estimates to divide the space into regions where the given
model-based policy is reliable, and regions where it may have flaws or be
ill-defined. In these uncertain regions, we show that a locally learned policy
can be used directly with raw sensory inputs. We test our algorithm, Guided
Uncertainty-Aware Policy Optimization (GUAPO), on a real-world robot performing
peg insertion. Videos are available at:
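The core mechanism described above, switching between a model-based controller and a locally learned policy based on an uncertainty estimate, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the names `guapo_step`, `model_based_action`, `learned_action`, and the threshold value are all hypothetical placeholders.

```python
import numpy as np

# Hypothetical uncertainty threshold (e.g., pose-estimate standard deviation
# in metres); the real criterion in GUAPO is derived from perception uncertainty.
THRESHOLD = 0.05

def model_based_action(state_estimate):
    # Placeholder scripted controller: move toward a nominal goal pose.
    goal = np.array([0.5, 0.0, 0.2])
    return goal - state_estimate

def learned_action(raw_observation):
    # Placeholder for a locally learned policy operating on raw sensory input.
    return 0.01 * np.tanh(raw_observation[:3])

def guapo_step(state_estimate, pose_uncertainty, raw_observation):
    """Use the model-based policy where its state estimate is reliable,
    and fall back to the learned policy in uncertain regions."""
    if pose_uncertainty < THRESHOLD:
        return model_based_action(state_estimate)
    return learned_action(raw_observation)
```

In this sketch the model-based policy handles the coarse approach, where perception is trustworthy, while the learned policy takes over near the uncertain contact region (e.g., the final peg-insertion phase).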

