People Detection benchmark repository

Results


In this section, we describe the experiments performed on the experimental dataset, covering different state-of-the-art approaches. All the approaches use the default settings proposed by their respective authors.

We evaluate all the approaches on the proposed dataset as follows. First, we present the experimental results for each video sequence. Then, we present the results for each of the proposed complexity categories: background complexity and classification complexity. In each evaluation, we use the average AUC and the algorithm ranking.

Results for each video sequence:

| Video | HOG[1] | ISM[2] | Edge[4] | DTDP[5] | ACF[6] Inria | ACF[6] Caltech | Faster-RCNN[7] VOC0712_ZF | Faster-RCNN[7] VOC0712_VGG |
|---|---|---|---|---|---|---|---|---|
| 1 | 89.3 | 71.4 | 34.9 | 84.5 | 96.7 | 99.3 | 86.4 | 99.8 / 99.7 |
| 2 | 63.2 | 82.9 | 92.5 | 90.2 | 66.3 | 77.1 | 92.9 | 80.8 / 98.2 |
| 3 | 55.6 | 75.7 | 64.3 | 71.7 | 69.7 | 68.9 | 56.0 | 78.0 / 82.9 |
| 4 | 10.1 | 1.0 | 0.5 | 5.4 | 13.2 | 33.9 | 59.1 | 20.8 / 37.5 |
| 5 | 61.3 | 71.2 | 3.5 | 7.1 | 85.1 | 51.6 | 24.7 | 96.3 / 96.2 |
| 6 | 49.9 | 34.6 | 8.1 | 31.7 | 74.9 | 67.2 | 54.4 | 88.4 / 93.1 |
| 7 | 0.0 | 3.0 | 7.6 | 7.2 | 11.0 | 2.7 | 89.4 | 70.1 / 75.9 |
| 8 | 0.0 | 12.2 | 17.7 | 21.4 | 0.0 | 4.6 | 33.4 | 59.0 / 79.8 |
| 9 | 47.4 | 65.9 | 45.9 | 74.4 | 90.4 | 72.2 | 76.1 | 93.1 / 94.7 |
| 10 | 7.2 | 5.5 | 21.2 | 14.3 | 0.7 | 8.7 | 59.5 | 79.4 / 91.0 |
| 11 | 10.7 | 5.9 | 2.1 | 33.7 | 11.4 | 8.6 | 34.8 | 44.8 / 95.4 |
| 12 | 82.0 | 76.2 | 65.7 | 70.5 | 92.4 | 91.1 | 92.6 | 98.6 / 99.6 |
| 13 | 70.9 | 73.6 | 15.5 | 59.3 | 80.2 | 87.2 | 81.6 | 93.3 / 98.6 |
| 14 | 13.6 | 29.2 | 12.7 | 46.9 | 44.1 | 21.3 | 51.1 | 85.7 / 95.9 |
| 15 | 46.5 | 20.5 | 16.6 | 41.2 | 67.9 | 70.0 | 86.2 | 97.4 / 98.5 |
| 16 | 21.1 | 71.5 | 37.7 | 60.9 | 83.3 | 89.6 | 54.5 | 94.3 / 95.9 |
| Average AUC | 39.3 | 43.8 | 27.9 | 45.0 | 55.4 | 53.4 | 64.5 | 80.0 / 90.0 |
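The "Average AUC" row and the algorithm ranking can be reproduced from the per-video values. The following is a minimal sketch, not the authors' evaluation code: it assumes the average is the plain arithmetic mean over the 16 sequences and that detectors are ranked by descending average AUC. The function names `average_auc` and `rank_detectors` are hypothetical, and only the first four detectors of the per-video table are transcribed to keep the example short.

```python
from statistics import mean

# Per-video AUC (%) for sequences 1-16, transcribed from the per-video table.
# Only four detectors are included to keep the sketch short.
auc_per_video = {
    "HOG":  [89.3, 63.2, 55.6, 10.1, 61.3, 49.9, 0.0, 0.0,
             47.4, 7.2, 10.7, 82.0, 70.9, 13.6, 46.5, 21.1],
    "ISM":  [71.4, 82.9, 75.7, 1.0, 71.2, 34.6, 3.0, 12.2,
             65.9, 5.5, 5.9, 76.2, 73.6, 29.2, 20.5, 71.5],
    "Edge": [34.9, 92.5, 64.3, 0.5, 3.5, 8.1, 7.6, 17.7,
             45.9, 21.2, 2.1, 65.7, 15.5, 12.7, 16.6, 37.7],
    "DTDP": [84.5, 90.2, 71.7, 5.4, 7.1, 31.7, 7.2, 21.4,
             74.4, 14.3, 33.7, 70.5, 59.3, 46.9, 41.2, 60.9],
}


def average_auc(scores):
    """Return the arithmetic mean AUC per detector (assumed definition)."""
    return {name: mean(values) for name, values in scores.items()}


def rank_detectors(averages):
    """Rank detectors by descending average AUC; rank 1 is the best."""
    ordered = sorted(averages, key=averages.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


averages = average_auc(auc_per_video)
ranking = rank_detectors(averages)

# Print the detectors from best to worst with their average AUC.
for name in sorted(ranking, key=ranking.get):
    print(f"{ranking[name]}. {name}: {averages[name]:.1f}")
```

The same two functions apply unchanged to the background complexity and classification complexity tables, with one list entry per category instead of per video.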

Results for each background complexity:

| Background complexity | HOG[1] | ISM[2] | Edge[4] | DTDP[5] | ACF[6] Inria | ACF[6] Caltech | Faster-RCNN[7] VOC0712_ZF | Faster-RCNN[7] VOC0712_VGG |
|---|---|---|---|---|---|---|---|---|
| Baseline | 69.4 | 76.7 | 63.9 | 82.1 | 77.6 | 81.8 | 78.4 | 86.2 / 93.6 |
| Dynamic Background | 35.7 | 36.1 | 2.0 | 6.3 | 49.2 | 42.8 | 41.9 | 58.5 / 66.9 |
| Camera Jitter | 25.0 | 18.8 | 7.9 | 19.4 | 42.9 | 34.9 | 71.9 | 79.2 / 84.5 |
| Intermittent Object Motion | 16.3 | 22.4 | 21.7 | 35.9 | 25.6 | 23.5 | 51.0 | 69.1 / 90.2 |
| Shadow | 46.8 | 54.2 | 29.6 | 55.8 | 73.6 | 71.8 | 73.2 | 93.8 / 97.7 |
| Average AUC | 38.6 | 41.6 | 25.0 | 39.9 | 53.8 | 51.0 | 63.3 | 77.4 / 86.6 |

Results for each classification complexity:

| Classification complexity | HOG[1] | ISM[2] | Edge[4] | DTDP[5] | ACF[6] Inria | ACF[6] Caltech | Faster-RCNN[7] VOC0712_ZF | Faster-RCNN[7] VOC0712_VGG |
|---|---|---|---|---|---|---|---|---|
| Low | 62.3 | 73.6 | 48.7 | 73.3 | 84.9 | 86.1 | 80.7 | 93.3 / 97.8 |
| Medium | 53.3 | 50.5 | 23.1 | 37.9 | 74.4 | 64.4 | 55.3 | 90.0 / 92.7 |
| High | 6.9 | 9.5 | 10.3 | 21.5 | 13.4 | 13.3 | 54.5 | 60.0 / 79.2 |
| Average AUC | 40.8 | 44.5 | 27.4 | 44.2 | 57.6 | 54.6 | 63.5 | 81.1 / 89.9 |

Associated references:

[1] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. of CVPR, 2005, pp. 886–893.

[2] B. Leibe, E. Seemann, and B. Schiele, “Pedestrian detection in crowded scenes,” in Proc. of CVPR, 2005, pp. 878–885.

[3] V. Fernandez-Carbajales, M. A. Garcia, and J. M. Martinez, “Robust people detection by fusion of evidence from multiple methods,” in Proc. of WIAMIS, 2008, pp. 55–58.

[4] A. Garcia-Martin and J. M. Martinez, “Robust real time moving people detection in surveillance scenarios,” in Proc. of AVSS, 2010, pp. 241–247.

[5] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, “Object detection with discriminatively trained part-based models,” PAMI, vol. 32, no. 9, September 2010, pp. 1627–1645.

[6] P. Dollar, R. Appel, and W. Kienzle, “Crosstalk cascades for framerate pedestrian detection,” in Proc. of ECCV, 2012, pp. 645–659.

[7] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” in Proc. of NIPS, 2015.