Kinetics

About

Kinetics-600 is a large-scale, high-quality dataset of YouTube video URLs covering a diverse range of human-focused actions. Our aim in releasing the Kinetics dataset is to help the machine learning community advance models for video understanding. It is an approximate superset of the original Kinetics dataset released in 2017, now called Kinetics-400.

The dataset consists of approximately 500,000 video clips and covers 600 human action classes, with at least 600 video clips for each action class. Each clip lasts around 10 seconds and is labeled with a single class. All of the clips have been through multiple rounds of human annotation, and each is taken from a unique YouTube video. The actions cover a broad range of classes, including human-object interactions such as playing instruments, as well as human-human interactions such as shaking hands and hugging.
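As a rough illustration of this per-clip structure, the sketch below groups clip annotations by action class. It assumes the annotations come as a CSV with columns for the action label, YouTube ID, clip start/end times, and split; those column names, and the file name used in the example, are illustrative, so check the downloaded files for the exact schema.

```python
import csv
from collections import defaultdict

def load_annotations(csv_path):
    """Group clip annotations by action class.

    Assumes a CSV with columns: label, youtube_id, time_start, time_end, split.
    These column names are illustrative; verify them against the released files.
    """
    clips_by_class = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            clips_by_class[row["label"]].append({
                "youtube_id": row["youtube_id"],
                # Each clip is roughly a 10-second window within the source video.
                "time_start": float(row["time_start"]),
                "time_end": float(row["time_end"]),
                "split": row["split"],
            })
    return clips_by_class

if __name__ == "__main__":
    # Illustrative file name; use the actual annotation file you downloaded.
    clips = load_annotations("kinetics_600_train.csv")
    print(f"{len(clips)} action classes loaded")
```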

Kinetics forms the basis of an international human action classification competition organised by ActivityNet.

Paper

For a detailed description of how the dataset was compiled and baseline classifier performance see our paper.

The Kinetics Human Action Video Dataset
Will Kay, Joao Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman
arXiv:1705.06950, May 2017

Please cite the paper if you use the dataset.

Download

Kinetics-600

These files were replaced on 1st May 2018 due to a splitting issue with the dataset.

Kinetics-400

Note that these files are updated periodically to remove links to videos that have been deleted from YouTube or made non-public.
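Because only video URLs and time windows are distributed, each clip has to be fetched from YouTube and trimmed locally before use. The sketch below shows one possible way to do this; it assumes the external yt-dlp and ffmpeg command-line tools are installed (neither is part of the Kinetics release), and it will simply fail for videos that have since been deleted or made non-public.

```python
import subprocess

def fetch_clip(youtube_id, time_start, time_end, out_path):
    """Download a video from YouTube and cut out the annotated clip.

    Assumes yt-dlp and ffmpeg are available on PATH; both are external tools,
    not part of the Kinetics release.
    """
    url = f"https://www.youtube.com/watch?v={youtube_id}"
    tmp_path = f"{youtube_id}.mp4"

    # Fetch the full video as MP4 (raises if the video is no longer available).
    subprocess.run(["yt-dlp", "-f", "mp4", "-o", tmp_path, url], check=True)

    # Trim to the annotated ~10 second window without re-encoding.
    subprocess.run([
        "ffmpeg", "-y",
        "-i", tmp_path,
        "-ss", str(time_start),
        "-to", str(time_end),
        "-c", "copy",
        out_path,
    ], check=True)
```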

The dataset is made available by Google, Inc. under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

To provide suggestions for new human action classes and other feedback on the dataset, click here.

Browse the Dataset

Explore a selection of clips from the dataset.

A cautionary note on the use of this dataset: Kinetics is drawn from the videos uploaded to YouTube, based on the title of the video provided by the uploader. This means that the clips obtained reflect the distribution of the uploaded videos. For example, some classes may contain predominantly males or females, and there might be a bias towards exciting and unusual videos. Consequently, the dataset is neither intended to be a canonical catalogue of human activities, nor are the example clips for the included action classes intended to be canonical representations of these actions. In particular, the distribution of gender, race, age or other factors across the depicted human actors should not be interpreted as representing the actual distribution of human actors.

Meet the Kinetics Team

Will Kay, Product Manager
João Carreira, Research Scientist
Eric Noland, Software Engineer
Brian Zhang, Research Engineer
Chloe Hillier, Program Manager
Prof. Andrew Zisserman, Research Scientist