Learning transferable motor skills with hierarchical latent mixture policies
For robots operating in the real world, it is desirable to learn reusable abstract behaviours that can be transferred effectively across numerous tasks and scenarios. We propose an approach to learn skills from data using a hierarchical mixture latent variable model. Our method exploits a multi-level hierarchy of both discrete and continuous latent variables to model a discrete set of abstract high-level behaviours while allowing for variance in how they are executed. We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours while retaining the flexibility of a continuous latent variable model. The resulting skills can be transferred to new tasks, to unseen objects, and from state-based to vision-based policies, yielding significantly better sample efficiency and asymptotic performance than existing skill- and imitation-based methods. Further analysis shows how and when the skills are most beneficial: they encourage directed exploration that better covers the regions of the state space relevant to the task, which makes them most effective in challenging sparse-reward settings.
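
As an illustration only, the hierarchy of discrete and continuous latent variables can be sketched as ancestral sampling: a discrete skill index is drawn first, then a continuous latent conditioned on that skill, then an action conditioned on the latent and the state. Everything concrete here is an assumption made for the sketch and is not taken from the paper: the function names `make_params` and `sample_action`, the linear maps, the fixed per-skill standard deviation, the `tanh` action squashing, and all dimensions.

```python
import numpy as np


def make_params(rng, state_dim=4, num_skills=3, latent_dim=2, action_dim=2):
    """Random toy parameters (hypothetical shapes, for illustration only)."""
    return {
        # High level: state -> logits over discrete skills.
        "W_skill": rng.normal(size=(num_skills, state_dim)),
        # Mid level: one Gaussian component per skill over the continuous latent.
        "mu": rng.normal(size=(num_skills, latent_dim)),
        "W_mu": 0.1 * rng.normal(size=(num_skills, latent_dim, state_dim)),
        "sigma": 0.1 * np.ones((num_skills, latent_dim)),
        # Low level: (latent, state) -> action.
        "W_act": rng.normal(size=(action_dim, latent_dim + state_dim)),
    }


def sample_action(state, params, rng):
    """Ancestral sampling: discrete skill -> continuous latent -> action."""
    # High level: categorical distribution over skills (stable softmax).
    logits = params["W_skill"] @ state
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    k = rng.choice(len(probs), p=probs)

    # Mid level: the chosen skill selects a Gaussian over the latent z,
    # which captures variation in how that skill is executed.
    mean = params["mu"][k] + params["W_mu"][k] @ state
    z = mean + params["sigma"][k] * rng.standard_normal(mean.shape)

    # Low level: decode a bounded action from the latent and the state.
    action = np.tanh(params["W_act"] @ np.concatenate([z, state]))
    return k, z, action


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = make_params(rng)
    k, z, action = sample_action(np.zeros(4), params, rng)
    print("skill:", k, "latent:", z, "action:", action)
```

In this toy form, the discrete index selects which behaviour is executed, while the continuous latent accounts for variance in how it is executed, mirroring the split between the two kinds of latent variable in the text above.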