Learning robust multi-modal policies for industrial robotic tasks via deep reinforcement learning and demonstrations


We present a framework for learning robust manipulation skills in industrial settings with different sensory input modalities. At the heart of our framework is a reinforcement learning (RL) agent, deep deterministic policy gradient from demonstration (DDPGfD), to which we make a few extensions to improve sample efficiency and generalization. We first consider an industrial robotic manipulation benchmark, the NIST board challenge, and show that our method solves these insertion tasks under large perturbations with a 99.99% success rate over 10,000 trials, outperforming the solutions of current robotic system integrators by a large margin. Additionally, we validate our method on two more challenging tasks: (1) the robot must perform insertions while the target is moving, and (2) the robot must precisely insert a key into a lock by making proper use of haptic information. In all these tasks, our method learns successful policies within a few hours of real-world interaction while achieving industrially desirable reliability with minimal hyperparameter tuning; it also enables robots to learn very challenging manipulation skills that are almost impossible for the robotics industry today. Overall, our framework is the first RL agent to be systematically evaluated on an industrial benchmark, and we believe this is the first step toward bringing general intelligence to the robot market.
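
For readers unfamiliar with DDPGfD, one of its core ideas is that demonstration transitions are placed in the replay buffer and kept there permanently, while transitions collected by the agent are evicted as the buffer fills. The following is a minimal Python sketch of that idea only. It is not the authors' implementation and does not include the extensions mentioned in the abstract; the names `DemoReplayBuffer`, `add_demo`, `add_agent`, and `sample` are hypothetical, and uniform sampling is used here as a simplification in place of the prioritized replay used by DDPGfD.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state, done); this layout is an assumption
# made for illustration, not taken from the paper.
Transition = Tuple[tuple, tuple, float, tuple, bool]


class DemoReplayBuffer:
    """Hypothetical replay buffer that never evicts demonstration data."""

    def __init__(self, agent_capacity: int, seed: int = 0) -> None:
        # Demonstrations are stored separately and kept for the whole run.
        self.demos: List[Transition] = []
        # Agent transitions are evicted first-in, first-out once full.
        self.agent: Deque[Transition] = deque(maxlen=agent_capacity)
        self.rng = random.Random(seed)

    def add_demo(self, transition: Transition) -> None:
        self.demos.append(transition)

    def add_agent(self, transition: Transition) -> None:
        self.agent.append(transition)

    def __len__(self) -> int:
        return len(self.demos) + len(self.agent)

    def sample(self, batch_size: int) -> List[Transition]:
        # Simplification: uniform sampling with replacement over both pools.
        pool = self.demos + list(self.agent)
        if not pool:
            raise ValueError("cannot sample from an empty buffer")
        return [self.rng.choice(pool) for _ in range(batch_size)]
```

In such a sketch, the demonstrations would be loaded once with `add_demo` before training starts, and every environment step would then call `add_agent` followed by one or more calls to `sample` to build training batches.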