Fine-grained Vehicle Detection and Classification Model for Video Structuring Description

doi:10.19562/j.chinasae.qcgc.2021.10.002

Abstract

To address the limited understanding of complex traffic scenes in driverless environment perception, this paper proposes a roadside-oriented video structuring description framework that enriches the fine-grained information of different targets in a traffic scene and thereby improves scene understanding. For this framework, the paper provides a fine-grained vehicle detection and classification model suitable for engineering deployment. The YOLOv4 algorithm is optimized with a channel pruning strategy: the compressed model, YOLOv4-Pruned, is about 60% smaller than the original model while its mAP remains almost unchanged. A vehicle classification scheme with 16 types and 12 colors is designed to effectively cover the vehicles operating in current traffic scenes, and its classification accuracy on the test set reaches 93%. With 1920 × 1080 input, the model runs stably at 23 FPS on an NVIDIA GeForce RTX 2080Ti, and the unquantized model runs stably at 13 FPS on a HiSilicon Hi3516DV300.

Key words: driverless technology, roadside environment perception, video structuring description algorithm, fine-grained vehicle detection and classification
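
The abstract names channel pruning but does not state which pruning criterion is used. As a rough illustration only, the sketch below assumes one common criterion: ranking channels by the absolute value of their batch-normalization scale factor and removing the weakest ones under a global threshold. The function `select_channels`, the layer names, and the scale-factor values are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the pruning criterion (|gamma| of batch-norm layers)
# and the global-threshold rule are assumptions, not the paper's stated method.

def select_channels(bn_gammas, prune_ratio):
    """Return, per layer, the indices of the channels to keep.

    bn_gammas: dict mapping a layer name to its list of BN scale factors.
    prune_ratio: fraction of all channels to remove (0 <= prune_ratio < 1).
    """
    # Rank every channel in the network by |gamma| to find a global threshold.
    all_scores = sorted(abs(g) for gammas in bn_gammas.values() for g in gammas)
    n_prune = int(len(all_scores) * prune_ratio)
    threshold = all_scores[n_prune] if n_prune < len(all_scores) else float("inf")

    keep = {}
    for name, gammas in bn_gammas.items():
        kept = [i for i, g in enumerate(gammas) if abs(g) >= threshold]
        # Never remove a whole layer: keep at least its strongest channel.
        if not kept:
            kept = [max(range(len(gammas)), key=lambda i: abs(gammas[i]))]
        keep[name] = kept
    return keep


# Hypothetical scale factors for two convolutional layers.
gammas = {
    "conv1": [0.90, 0.02, 0.45, 0.01],
    "conv2": [0.03, 0.70, 0.04, 0.60, 0.05, 0.80],
}
kept = select_channels(gammas, prune_ratio=0.5)
for layer, indices in kept.items():
    print(layer, "keeps channels", indices)
```

In a real pipeline the kept indices would be used to slice the convolution weights of each layer and of the layer that consumes its output, followed by fine-tuning to recover accuracy; those steps are omitted here.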

Jian Shi, Qian Cheng, Lisheng Jin, Yaoguang Hu, Xiaobei Jiang, Baicang Guo, Wuhong Wang. Fine-grained Vehicle Detection and Classification Model for Video Structuring Description[J]. Automotive Engineering, 2021, 43(10): 1427-1434.

Figures/Tables (14): Figures 1-7 and Tables 1-7.

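No.	Label	Type
1	Sedan	Sedan
2	SUV	Sport utility vehicle
3	Bus	City bus
4	Minivan	Van
5	Smalltruck	Mini truck
6	Fencetruck	Light truck
7	Mediumtruck	Medium truck
8	Largetruck	Large truck
9	HGV	Heavy goods vehicle
10	Pickuptruck	Pickup truck
11	MPV	Multi-purpose vehicle (business van)
12	LPV	Light passenger vehicle
13	Minibus	Medium passenger bus
14	Intercitybus	Large passenger coach
15	Containertruck	Container truck
16	Schoolbus	School bus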

No.	Label	Color
1	White	White
2	Gray	Gray
3	Black	Black
4	Blue	Blue
5	Yellow	Yellow
6	Red	Red
7	Green	Green
8	Cyan	Cyan (blue-green)
9	Dark gray	Dark gray
10	Brown	Brown
11	Purple	Purple
12	Golden	Gold
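
The two label tables above define 16 vehicle types and 12 colors. One possible way to turn a pair of classifier outputs into a structured record is sketched below. The `VehicleRecord` class, the `decode` helper, and the bounding-box layout are hypothetical names introduced for illustration; only the label strings come from the tables.

```python
from dataclasses import dataclass

# Label names copied from the two tables above, in table order.
VEHICLE_TYPES = (
    "Sedan", "SUV", "Bus", "Minivan", "Smalltruck", "Fencetruck",
    "Mediumtruck", "Largetruck", "HGV", "Pickuptruck", "MPV", "LPV",
    "Minibus", "Intercitybus", "Containertruck", "Schoolbus",
)
VEHICLE_COLORS = (
    "White", "Gray", "Black", "Blue", "Yellow", "Red",
    "Green", "Cyan", "Dark gray", "Brown", "Purple", "Golden",
)


@dataclass(frozen=True)
class VehicleRecord:
    """Hypothetical structured description of one detected vehicle."""
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels; assumed layout
    vehicle_type: str
    color: str


def decode(box, type_no, color_no):
    """Map the 1-based numbers used in the tables to label names."""
    if not 1 <= type_no <= len(VEHICLE_TYPES):
        raise ValueError(f"unknown vehicle type number: {type_no}")
    if not 1 <= color_no <= len(VEHICLE_COLORS):
        raise ValueError(f"unknown color number: {color_no}")
    return VehicleRecord(box, VEHICLE_TYPES[type_no - 1], VEHICLE_COLORS[color_no - 1])


record = decode((100, 200, 420, 380), type_no=2, color_no=8)
print(record)
```

Keeping type and color as two separate label sets means 16 + 12 output classes describe 16 × 12 = 192 type-color combinations.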

No.	Label	Training set (images)	Test set (images)
1	Sedan	215 467	5 000
2	SUV	87 886	4 000
3	Bus	6 974	500
4	Minivan	24 932	1 000
5	Smalltruck	5 488	200
6	Fencetruck	2 195	100
7	Mediumtruck	5 526	200
8	Largetruck	2 003	100
9	HGV	1 781	100
10	Pickuptruck	6 194	250
11	MPV	18 798	500
12	LPV	3 760	100
13	Minibus	2 900	100
14	Intercitybus	2 642	100
15	Containertruck	2 016	100
16	Schoolbus	3 032	150
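
The per-class counts in the table above differ by two orders of magnitude. The short calculation below, using only the training-set numbers from the table, shows each class's share of the training set and the ratio between the largest and smallest classes. The variable names are illustrative.

```python
# Training-set image counts copied from the table above.
TRAIN_COUNTS = {
    "Sedan": 215467, "SUV": 87886, "Bus": 6974, "Minivan": 24932,
    "Smalltruck": 5488, "Fencetruck": 2195, "Mediumtruck": 5526,
    "Largetruck": 2003, "HGV": 1781, "Pickuptruck": 6194, "MPV": 18798,
    "LPV": 3760, "Minibus": 2900, "Intercitybus": 2642,
    "Containertruck": 2016, "Schoolbus": 3032,
}

total = sum(TRAIN_COUNTS.values())
# Fraction of the training set that belongs to each class.
shares = {label: count / total for label, count in TRAIN_COUNTS.items()}
# Ratio between the most and least frequent classes.
imbalance_ratio = max(TRAIN_COUNTS.values()) / min(TRAIN_COUNTS.values())

print(f"total training images: {total}")
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label:15s} {share:6.2%}")
print(f"largest / smallest class: {imbalance_ratio:.1f}x")
```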

Model	Model size (MB)	Forward inference time (ms)	mAP
YOLOv4-Baseline	200.6	50.7	0.815
YOLOv4-Pruned	78.7	29.5	0.809
YOLOv3	246.4	54	0.780
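
The abstract's statement that the pruned model is about 60% smaller can be checked directly against the table above. The following calculation uses only the table values; the helper `relative_reduction` is an illustrative name.

```python
# Values copied from the model comparison table above.
MODELS = {
    "YOLOv4-Baseline": {"size_mb": 200.6, "time_ms": 50.7, "map": 0.815},
    "YOLOv4-Pruned": {"size_mb": 78.7, "time_ms": 29.5, "map": 0.809},
    "YOLOv3": {"size_mb": 246.4, "time_ms": 54.0, "map": 0.780},
}


def relative_reduction(before, after):
    """Fraction by which a quantity shrinks going from `before` to `after`."""
    return (before - after) / before


base = MODELS["YOLOv4-Baseline"]
pruned = MODELS["YOLOv4-Pruned"]

size_reduction = relative_reduction(base["size_mb"], pruned["size_mb"])
time_reduction = relative_reduction(base["time_ms"], pruned["time_ms"])
speedup = base["time_ms"] / pruned["time_ms"]
map_drop = base["map"] - pruned["map"]

print(f"model size reduction:     {size_reduction:.1%}")
print(f"inference time reduction: {time_reduction:.1%}")
print(f"inference speedup:        {speedup:.2f}x")
print(f"mAP drop:                 {map_drop:.3f}")
```

By these numbers the pruned model is roughly 61% smaller and about 1.7 times faster in forward inference than the baseline, at a cost of 0.006 mAP.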