
An Event Detection dataset


This section presents, via the left menu, a description of the test sequences for each category, along with sample frames, low-resolution video previews and the event annotations. The annotations were created with the VIPER toolkit, and the video files are encoded with the MPEG-1 codec to remain compatible with it.

Currently, the dataset contains 17 sequences recorded with a stationary camera at a resolution of 320x240 and 12 fps. The dataset focuses on two types of human-related events: interactions and activities. In particular, two activities (Hand Up and Walking) and three human-object interactions (Leave, Get and Use object) have been annotated.
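Because the sequences are recorded at a fixed 12 fps, frame indices in an annotation can be mapped directly to timestamps. The following is a minimal sketch of that conversion. The helper names `frame_to_seconds` and `span_duration`, and the 1-based, inclusive frame convention, are assumptions made for illustration and are not part of the dataset or its tools.

```python
FPS = 12  # frame rate stated for the dataset


def frame_to_seconds(frame: int, fps: int = FPS, first_frame: int = 1) -> float:
    """Return the start time, in seconds, of a frame index.

    Assumes frames are numbered from `first_frame` (1 by default);
    adjust if the annotation files use 0-based indices.
    """
    if frame < first_frame:
        raise ValueError(f"frame must be >= {first_frame}, got {frame}")
    return (frame - first_frame) / fps


def span_duration(start: int, end: int, fps: int = FPS) -> float:
    """Return the duration, in seconds, of an inclusive frame span [start, end]."""
    if end < start:
        raise ValueError("end frame must not precede start frame")
    return (end - start + 1) / fps
```

For example, under these assumptions an event annotated on frames 25 to 60 covers 36 frames, so `span_duration(25, 60)` returns 3.0 seconds.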

We have grouped all the test sequences into three categories according to a subjective estimation of the analysis complexity, considering four criteria:

Foreground complexity (S1), defined as the difficulty of extracting the foreground due to the presence of edges, multiple textures, lighting changes, reflections, shadows and objects belonging to the background.
Tracking complexity (S2), defined as the difficulty of tracking foreground blobs in the sequence. It mainly differentiates crowded from less-populated sequences.
Feature complexity (S3), defined as the difficulty of classifying moving and temporally stationary foreground in a scenario in order to extract and analyze relevant features.
Event complexity (S4), defined as the difficulty of detecting and recognizing the annotated events in a scenario. It is related to the speed at which the event is executed, the (partial) occlusion of the action performed and the variability in the actor's appearance.

Sample frames of each category are shown in the following images:

[Images: sample frames of Category 1, Category 2 and Category 3]

The following table summarizes the annotated events in the dataset and the complexity associated with each category:

[Table: dataset categories]

The complexity estimation codes are Low (L), Medium (M), High (H) and Very High (V). The events are Leave-object (LEA), Get-object (GET), Use-object (USE), Hand Up (HUP) and Walking (WLK).
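When working with the annotations programmatically, these codes can be represented as enumerations. The sketch below is one possible encoding, not something shipped with the dataset. The class names, the `decode_profile` helper and the idea of writing the four criteria (S1 to S4) as a four-letter string are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict


class Complexity(Enum):
    """Complexity estimation codes used in the category table."""
    LOW = "L"
    MEDIUM = "M"
    HIGH = "H"
    VERY_HIGH = "V"


class Event(Enum):
    """Event codes annotated in the dataset."""
    LEAVE_OBJECT = "LEA"
    GET_OBJECT = "GET"
    USE_OBJECT = "USE"
    HAND_UP = "HUP"
    WALKING = "WLK"


# Grouping stated in the dataset description:
# three human-object interactions and two activities.
INTERACTIONS = frozenset({Event.LEAVE_OBJECT, Event.GET_OBJECT, Event.USE_OBJECT})
ACTIVITIES = frozenset({Event.HAND_UP, Event.WALKING})

# The four complexity criteria, in the order they are defined.
CRITERIA = ("S1", "S2", "S3", "S4")


def decode_profile(codes: str) -> Dict[str, Complexity]:
    """Map a four-letter string (hypothetical format) to the S1..S4 criteria.

    Raises ValueError if the string has the wrong length or contains
    a letter that is not a known complexity code.
    """
    if len(codes) != len(CRITERIA):
        raise ValueError(f"expected {len(CRITERIA)} codes, got {len(codes)}")
    return {name: Complexity(code) for name, code in zip(CRITERIA, codes.upper())}
```

As a usage example, `Event("HUP")` resolves to `Event.HAND_UP`, and `decode_profile("LMHV")` maps S1 to Low, S2 to Medium, S3 to High and S4 to Very High. The string "LMHV" is a made-up input and does not correspond to a row of the table.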