"Disturbed YouTube for Kids: Characterizing and Detecting Inappropriate Videos Targeting Young Children"

Kostantinos Papadamou, Cyprus University of Technology
Antonis Papasavva, Cyprus University of Technology
Savvas Zannettou, Cyprus University of Technology
Jeremy Blackburn, University of Alabama at Birmingham
Nicolas Kourtellis, Telefonica Research
Ilias Leontiadis, Telefonica Research
Gianluca Stringhini, Boston University
Michael Sirivianos, Cyprus University of Technology

Publication Date

1-31-2020

Abstract

Dataset for paper: Disturbed YouTube for Kids: Characterizing and Detecting Inappropriate Videos Targeting Young Children

The dataset consists of five files:
1. groundtruth_videos.json: The ground truth dataset, containing 4,797 manually annotated videos (1,513 suitable, 929 disturbing, 419 restricted, and 1,936 irrelevant). Each video's label is stored in the 'classification_label' field.
2. elsagate_related_videos.json: Contains the data for 233K Elsagate-related YouTube videos (1K seed and 232K recommended) that were obtained as described in the paper.
3. other_child_related_videos.json: Contains the data for 155K other child-related YouTube videos (2K seed and 153K recommended) that were obtained as described in the paper.
4. random_videos.json: Contains the data for 482K random YouTube videos (8K seed and 474K recommended) that were obtained as described in the paper.
5. popular_videos.json: Contains the data for 11K popular YouTube videos (500 seed and 10.5K recommended) that were obtained between November 18 and November 21, 2018, as described in the paper.

For every video in all five files, the label predicted by our classifier is stored in the 'prediction' field.
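The fields described above can be inspected with a few lines of Python. The following is a minimal sketch, not part of the dataset: it assumes each file is either a single JSON array or has one JSON object per line, since the exact file layout is not stated here. Only the field names 'classification_label' and 'prediction' come from the description; the helper names `iter_records`, `label_counts`, and `label_vs_prediction` are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield video records from one of the dataset files.

    Assumption: the file is either one JSON array of objects or has one
    JSON object per line; the record description does not say which.
    """
    with open(path, encoding="utf-8") as fh:
        first_char = fh.read(1)
        fh.seek(0)
        if first_char == "[":
            # Whole file is a single JSON array.
            yield from json.load(fh)
        else:
            # Treat the file as one JSON object per non-empty line.
            for line in fh:
                line = line.strip()
                if line:
                    yield json.loads(line)


def label_counts(records: Iterable[dict],
                 field: str = "classification_label") -> Counter:
    """Count how often each value of `field` occurs.

    Records that lack the field are counted under the key None.
    """
    return Counter(record.get(field) for record in records)


def label_vs_prediction(records: Iterable[dict]) -> Counter:
    """Cross-tabulate manual labels against classifier predictions.

    Only records carrying both fields are counted. No assumption is made
    that the two fields share the same set of label values.
    """
    return Counter(
        (record["classification_label"], record["prediction"])
        for record in records
        if "classification_label" in record and "prediction" in record
    )
```

If the layout assumption holds, `label_counts(iter_records("groundtruth_videos.json"))` can be checked against the per-label counts listed for the ground truth file, and `label_vs_prediction` gives a quick view of where the classifier and the annotators disagree. The cross-tabulation is used instead of a single accuracy figure because the description does not say whether 'prediction' uses the same label values as 'classification_label'.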

Related Items

Is supplement to: 10.5281/zenodo.3739061
Is supplement to: 10.5281/zenodo.4534217

Repository

Zenodo

Access Instructions

Access to this data is restricted.

Funder

Funder: European Commission
Funder DOI: 10.13039/501100000780
Project: EnhaNcing seCurity And privacy in the Social wEb: a user centered approach for the protection of minors
Grant number: 691025
