In reinforcement learning (RL) applications, random perturbations influence the exact amount of reward received. A typical algorithm predicts the average reward across multiple trials and uses this prediction to decide how to act. In our latest work, we show that it is possible to model not only the average but also the full variation of this reward, which we call the value distribution. This results in RL systems that are more accurate and faster to train than previous models and, more importantly, opens up the possibility of rethinking the whole of reinforcement learning.
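To make the contrast in the paragraph above concrete, here is a minimal toy sketch in Python. It is not the method from the work itself. The reward process (`sample_reward`, paying +10 or -10 with equal probability) and the helper names are hypothetical, chosen only for illustration. The sketch compares the single average a typical algorithm would keep with an empirical distribution over the same rewards.

```python
import random
from collections import Counter


def sample_reward(rng):
    """Hypothetical noisy reward: +10 with probability 0.5, otherwise -10."""
    return 10.0 if rng.random() < 0.5 else -10.0


def estimate_mean(samples):
    """The usual summary: one number, the average reward across trials."""
    return sum(samples) / len(samples)


def estimate_distribution(samples):
    """The distributional summary: empirical probability of each reward value."""
    counts = Counter(samples)
    return {value: count / len(samples) for value, count in counts.items()}


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_reward(rng) for _ in range(10_000)]

mean_reward = estimate_mean(samples)
reward_distribution = estimate_distribution(samples)

print("mean reward:", mean_reward)
print("reward distribution:", reward_distribution)
```

In this toy case the average is close to zero, which hides the fact that the reward is never actually near zero. The distribution keeps that information: roughly half the probability sits on +10 and half on -10.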
Irina joined DeepMind’s neuroscience team in 2015 following a PhD at Oxford University. Her work takes inspiration from the way that babies learn from the world around them, using unsupervised interactions with the environment, copying others, and testing new hypotheses. By developing AI methods that can do this, she hopes to create far more capable, resilient algorithms that can adapt to new challenges and perform a larger number of tasks. She joined DeepMind because it gave her a huge amount of freedom to work on the ideas she is passionate about in an “inspirational, incredible” environment – a “heaven for AI geeks.”
Senior Research Scientist
Shakir grew up in Johannesburg, South Africa, and initially pursued a degree in electrical and information engineering before becoming intrigued by the principles of learning systems and moving on to graduate study at Cambridge University and the Canadian Institute for Advanced Research (CIFAR), exploring Neural Computation and Adaptive Perception. He then joined DeepMind as a Research Scientist, exploring the fundamentals of imagination, reasoning, and future thinking without the need for external signals. He loves working at DeepMind because of its unique environment, which embraces and encourages different approaches to machine learning, and he relishes the opportunity to regularly think about the ways in which machine learning and AI can be used to truly overcome the challenges facing humanity.
Senior Research Scientist
Raia is a Senior Research Scientist working on Deep Learning at DeepMind, with a particular focus on solving robotics and navigation using deep neural networks. She joined DeepMind following positions at Carnegie Mellon and SRI International, as she saw the combination of research into games, neuroscience, deep learning and reinforcement learning as a unique proposition that could lead to fundamental breakthroughs in AI. She says that one of her favourite moments at DeepMind was watching the livestream of Lee Sedol playing AlphaGo at 4am, surrounded by the rest of the team despite the time difference!