Kinetics is a large-scale, high-quality dataset of URL links to approximately 650,000 video clips covering 700 human action classes, including human-object interactions such as playing instruments and human-human interactions such as shaking hands and hugging. Each action class has at least 600 video clips. Each clip is human-annotated with a single action class and lasts around 10 seconds.
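As a rough illustration of the structure described above, the sketch below models one annotated clip and checks the per-class minimum. It is only a sketch under stated assumptions: the `ClipAnnotation` type, its field names, and the `classes_below_minimum` helper are hypothetical and are not taken from the dataset's actual file format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipAnnotation:
    # Hypothetical record; field names are assumptions for illustration only.
    url: str                  # link to the source video
    label: str                # the single human-annotated action class
    start_s: float            # assumed start offset of the clip, in seconds
    duration_s: float = 10.0  # clips last around 10 seconds


def classes_below_minimum(clips, min_clips=600):
    """Return {label: count} for every class with fewer than min_clips clips."""
    counts = Counter(clip.label for clip in clips)
    return {label: n for label, n in counts.items() if n < min_clips}


# Sanity check on the stated figures: 700 classes with at least 600 clips each
# implies at least 420,000 clips, consistent with a total of roughly 650,000.
MIN_TOTAL = 700 * 600
```

Running `classes_below_minimum` over a full list of annotations would return an empty dictionary if every class meets the stated minimum of 600 clips.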

Kinetics 700


Kinetics 600


Kinetics 400



22 May 2017