This repository contains an implementation of "Importance Weighted Actor-Learner Architectures", along with a dynamic batching module.
05 Feb 2018
Theory & foundations
Soham De, Samuel L. Smith, arXiv 2020
We present a new method for training reinforcement learning agents from human feedback in the presence of unknown unsafe...
13 Dec 2019