
Traffic Volume Count from Surveillance Cameras in a Highly Occluded Environment Using a Deep Convolutional Neural Network

Ashutosh Kumar, Takehiro Kashiyama, Yoshihide Sekimoto

Sekimoto Lab, IIS, the University of Tokyo

Background

Analysis of traffic volume is essential for road traffic infrastructure and transportation policies. Many developing countries do not have any automated traffic volume counter to obtain traffic data, which makes it difficult for governments to make transportation policies for passenger and goods movement efficiently. In this research, we use YOLOv3 together with the Simple Online and Realtime Tracking (SORT) algorithm to demonstrate a traffic volume counter for surveillance videos. We train YOLOv3 on surveillance video frames at intersections with high occlusion between vehicles for detection, and then track the vehicles using the SORT algorithm. Our model trained with surveillance images achieves an overall recall of 85%, outperforming the model trained with the COCO dataset by 73%. Such a traffic volume counter can be useful for gathering traffic flow data in developing countries as well as for real-time driving assistance systems.

Objectives

(i) To develop a dataset for vehicle detection in occluded environments, the Yangon Surveillance Images at Intersection (YSII) dataset,

(ii) to examine YOLOv3 for vehicle detection using models trained on the YSII dataset and the Common Objects in Context (COCO) dataset, and

(iii) to compare the performance of the traffic volume counter for vehicles moving in different directions at an intersection using the two datasets.

Methodology

i) Development of the YSII dataset

We prepare 10,241 annotations of vehicles from the surveillance videos of 'Pyay Road and Hledan Road' and 'U Wisara and Dhama Zay Ti'. Vehicles are annotated manually, so label quality reflects human visual accuracy.
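The poster does not state the label format behind the 10,241 annotations. Purely as an illustration, Darknet-style YOLO labels store one box per line in normalised coordinates; the hypothetical helper below converts a pixel box into that form.

def to_yolo_label(cls_id, x, y, bw, bh, img_w, img_h):
    # Darknet label line: "<class> <x_centre> <y_centre> <width> <height>", all normalised to [0, 1]
    return (f"{cls_id} {(x + bw / 2) / img_w:.6f} {(y + bh / 2) / img_h:.6f} "
            f"{bw / img_w:.6f} {bh / img_h:.6f}")

# e.g. a 120x80 px vehicle box with top-left corner (600, 340) in a 1280x720 frame:
print(to_yolo_label(0, 600, 340, 120, 80, 1280, 720))
# -> "0 0.515625 0.527778 0.093750 0.111111"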

ii) Detection of vehicles

For vehicle detection, we use the YOLOv3 neural network architecture shown in Figure 1, with convolutional weights pre-trained on ImageNet.

Figure 1. Neural Network architecture for vehicle detection
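As a minimal sketch of the per-frame inference step only: the snippet below runs a YOLOv3 Darknet model through OpenCV's DNN module. The file names yolov3.cfg / yolov3.weights, the 416x416 input size, and the 0.5/0.4 thresholds are assumptions, not values from the poster.

import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # hypothetical file names
layer_names = net.getLayerNames()
out_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers().flatten()]

def detect_vehicles(frame, conf_thresh=0.5, nms_thresh=0.4):
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes, scores = [], []
    for output in net.forward(out_layers):
        for det in output:                      # det = [cx, cy, bw, bh, objectness, class scores...]
            score = float(det[5:].max())
            if score < conf_thresh:
                continue
            cx, cy, bw, bh = det[:4] * np.array([w, h, w, h])
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            scores.append(score)
    keep = cv2.dnn.NMSBoxes(boxes, scores, conf_thresh, nms_thresh)
    return [(boxes[i], scores[i]) for i in np.array(keep).flatten()]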

iii) Tracking and counting the vehicles

For vehicle tracking, we use the Simple Online and Realtime Tracking (SORT) algorithm, which requires vehicle detections on every frame. For counting, we use the intersection between the flow line of each tracked vehicle and the screen line, as shown in Figure 2.

Figure 2. Screen line (in red) registers the vehicle count on intersection with the flow line (in yellow), as shown by the arrows
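A sketch of the tracking-and-counting step under stated assumptions: Bewley et al.'s reference SORT implementation exposes a Sort class whose update() takes rows of [x1, y1, x2, y2, score] and returns rows with a track ID appended; the screen-line test is a standard segment-intersection check. The screen-line endpoints and the video_frames iterable are hypothetical, and detect_vehicles reuses the detection sketch above.

import numpy as np
from sort import Sort                     # Bewley et al.'s reference SORT implementation

tracker = Sort()
screen_line = ((100, 400), (700, 400))    # hypothetical screen-line endpoints in pixels
last_centre = {}                          # track id -> box centre in the previous frame
count = 0

def ccw(a, b, c):
    return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])

def crosses(p1, p2, q1, q2):
    # Segment-intersection test: does the flow-line step p1->p2 cross the screen line q1-q2?
    return ccw(p1, q1, q2) != ccw(p2, q1, q2) and ccw(p1, p2, q1) != ccw(p1, p2, q2)

for frame in video_frames:                # video_frames: any iterable of frames (assumption)
    raw = detect_vehicles(frame)          # [( [x, y, w, h], score ), ...] from the sketch above
    dets = (np.array([[x, y, x + bw, y + bh, s] for (x, y, bw, bh), s in raw])
            if raw else np.empty((0, 5)))
    for x1, y1, x2, y2, tid in tracker.update(dets):
        centre = ((x1 + x2) / 2, (y1 + y2) / 2)
        if tid in last_centre and crosses(last_centre[tid], centre, *screen_line):
            count += 1                    # flow line crossed the screen line: register one vehicle
        last_centre[tid] = centre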

Results

i) Detection results with the datasets

Dataset   Average Recall
COCO      0.49
YSII      0.85

[Figure panels: detection examples with the COCO dataset (left) and with the YSII dataset (right)]
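As a sanity check on the figures above, the 73% quoted in the Background is the relative recall gain of the YSII-trained model over the COCO-trained one:

recall_ysii, recall_coco = 0.85, 0.49
relative_gain = (recall_ysii - recall_coco) / recall_coco
print(f"relative recall gain: {relative_gain:.0%}")   # -> 73%, the figure quoted in the Background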

ii) Counting of vehicles

[Bar chart: vehicle counts at twelve screen lines (SL 1 to SL 12), y-axis 0 to 70, comparing the YSII model, the COCO model, and the actual count]

Figure 3. Vehicle count at twelve screen lines