The first computer program to ever beat a professional player at the game of Go.
Better than human-level control of classic Atari games through Deep Reinforcement Learning.
Cannes Lions Innovation Award 2016: AlphaGo is celebrated for its pioneering technological creativity with a Grand Prix Lion (best of Gold).
Dueling Network Architectures for Deep Reinforcement Learning
Pixel Recurrent Neural Networks
Thompson Sampling is Asymptotically Optimal in General Environments
A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here, we introduce an algorithm based solely on reinforcement learning, without human data, guidance, or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100-0 against the previously published, champion-defeating AlphaGo.
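As a rough illustration of the training signal described in this abstract, the two prediction targets (the search's move selections and the game winner) can be combined into a single objective. The symbols below are not defined in the abstract and are introduced here only for illustration: s is a board position, f_θ the network with parameters θ, p its predicted move probabilities, v its predicted outcome, π the move probabilities produced by tree search, z the actual game winner (+1 or -1), and c an assumed regularisation weight. This is a sketch of the general form, not a full specification of the method.

```latex
% Network output for position s: move probabilities p and value estimate v
(\mathbf{p}, v) = f_\theta(s)

% Combined loss: squared error on the winner, cross-entropy against the
% search's move selections, plus L2 weight regularisation
\ell = (z - v)^2 \;-\; \boldsymbol{\pi}^{\top} \log \mathbf{p} \;+\; c \,\lVert \theta \rVert^2
```

The first term pushes v towards the observed winner, and the second pushes p towards the stronger move distribution found by search, which is how the network can act as its own teacher from one iteration to the next.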
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio. When applied to text-to-speech, it yields state-of-the-art performance, with human listeners rating it as significantly more natural sounding than the best parametric and concatenative systems for both English and Mandarin. A single WaveNet can capture the characteristics of many different speakers with equal fidelity, and can switch between them by conditioning on the speaker identity. When trained to model music, we find that it generates novel and often highly realistic musical fragments. We also show that it can be employed as a discriminative model, returning promising results for phoneme recognition.
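The phrase "fully probabilistic and autoregressive" in this abstract can be made concrete with a short factorisation. The notation is an assumption added for illustration: x = (x_1, ..., x_T) is a waveform of T audio samples, and h is an optional conditioning input such as a speaker identity.

```latex
% Joint probability of a waveform as a product of per-sample conditionals
p(\mathbf{x}) = \prod_{t=1}^{T} p\left(x_t \mid x_1, \ldots, x_{t-1}\right)

% Conditional variant, e.g. with a speaker identity h
p(\mathbf{x} \mid \mathbf{h}) = \prod_{t=1}^{T} p\left(x_t \mid x_1, \ldots, x_{t-1}, \mathbf{h}\right)
```

Each factor is the predictive distribution for one sample given all earlier samples, so generation proceeds one sample at a time, while training can evaluate every factor of a recorded waveform at once because all the earlier samples are already known.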
We're looking for exceptional people.
Andreas is our Head of Research Engineering and joined DeepMind in 2012. One of his earliest memories of DeepMind is having meetings on the 'meeting picnic blanket' in Russell Square, after having run out of space in the first office! Previously Andreas was a postdoc at Imperial College London, working on spiking neural network simulations using GPUs in the Cognitive Robotics Lab. His team works to accelerate the research programme at DeepMind by providing the software used across all research projects, as well as directly working on research projects. Andreas’ main focus is making sure his team is getting to work on interesting problems and that the research team functions smoothly and has the tools and support it needs. He says DeepMind is a “great collaborative environment and the best place to be at the forefront of developments in AI.”
Raia is a Senior Research Scientist working on Deep Learning at DeepMind, with a particular focus on solving robotics and navigation using deep neural networks. She joined DeepMind following positions at Carnegie Mellon and SRI International, as she saw the combination of research into games, neuroscience, deep learning and reinforcement learning as a unique proposition that could lead to fundamental breakthroughs in AI. She says that one of her favourite moments at DeepMind was watching the livestream of Lee Sedol playing AlphaGo at 4am, surrounded by the rest of the team despite the difference in timezone!
Frederic joined as a Research Engineer in July 2015. Prior to DeepMind, he was a research engineer at the Foundry, a VFX software company. Frederic’s job is to accelerate research, and take the lead on the engineering side of projects. He mainly focuses on generative models, a family of models belonging to the field of unsupervised learning. He describes his job as trying to teach a computer to process data like the human brain: “To dream and imagine things that it has never seen before. One way to achieve this is to show the computer a lot of data and let it figure out why things look like they do.” Frederic joined DeepMind to be a part of our exciting and challenging mission to solve intelligence. His favourite DeepMind memory was watching the AlphaGo vs Lee Sedol match: “The suspense and atmosphere in the office was amazing.”