Model Primitive Hierarchical Lifelong Reinforcement Learning. (arXiv:1903.01567v1 [cs.LG])

Learning interpretable and transferable subpolicies and performing task
decomposition from a single, complex task is difficult. Some traditional
hierarchical reinforcement learning techniques enforce this decomposition in a
top-down manner, while meta-learning techniques require a task distribution at
hand to learn such decompositions. This paper presents a framework for using
diverse suboptimal world models to decompose complex task solutions into
simpler modular subpolicies. This framework performs automatic decomposition of
a single source task in a bottom-up manner, concurrently learning the required
modular subpolicies as well as a controller to coordinate them. We perform a
series of experiments on high-dimensional continuous-action control tasks to
demonstrate the effectiveness of this approach at both complex single-task
learning and lifelong learning. Finally, we perform ablation studies to
understand the importance and robustness of different elements in the framework
and the limitations of this approach.
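The abstract's core idea is that several suboptimal world models can drive task decomposition: each model explains some portion of the environment's dynamics, and a controller can weight the corresponding subpolicy by how well its model predicts observed transitions. The following is a minimal sketch of that gating idea, not the paper's actual algorithm; the function names and the softmax-over-prediction-error rule are illustrative assumptions.

```python
import math

def gate_weights(world_models, state, action, next_state):
    """Hypothetical controller gating: weight each subpolicy by how well
    its (suboptimal) world model explains the observed transition.

    Each world model is a callable (state, action) -> predicted next state.
    Weights are a softmax over negative squared prediction errors, so the
    model that best predicts the transition gates in its subpolicy most.
    """
    errors = [
        sum((pred - actual) ** 2
            for pred, actual in zip(model(state, action), next_state))
        for model in world_models
    ]
    exps = [math.exp(-e) for e in errors]
    total = sum(exps)
    return [w / total for w in exps]
```

Under this kind of rule, decomposition emerges bottom-up: regions of the state space where one model predicts well become the "territory" of its subpolicy, with no top-down task segmentation required.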
