Real-time single-view video event recognition in controlled environments

This page presents sample results of the proposed system for recognizing human activities and interactions with objects in the environment. The system extracts multiple features from foreground blobs, such as their trajectory, skin regions and people likelihood. Available contextual information is integrated into the system to guide the event recognition process.

Original test sequences with ground truth data are also provided.

Related publication

Real-time single-view video event recognition in controlled environments
J.C. SanMiguel, M. Escudero-Viñolo, J. M. Martínez and J. Bescós
9th International Workshop on Content-Based Multimedia Indexing (CBMI2011)

Contact Information

J.C. SanMiguel

Sample results

Download the videos¹ from the links below to view sample experimental results of the proposed system. The system parameters have been set empirically according to the lighting conditions and people's appearance in each sequence.

¹ The XviD ISO MPEG-4 codec is needed to play the video files (download it here).


Currently, the dataset contains 17 sequences classified according to their analysis complexity. All sequences were recorded with a stationary camera at a resolution of 320x240 pixels and 12 fps.

Scenario 1 Scenario 2 Scenario 3

A summary of the annotated events in the dataset and the associated complexity of each category is available here.

The video files of the dataset are available here.

© Video Processing & Understanding Lab (