Automotive Engineering (汽车工程), 2021, Vol. 43, Issue 10: 1427-1434. doi: 10.19562/j.chinasae.qcgc.2021.10.002

面向视频结构化的细粒度车辆检测分类模型

石健1,成前1,金立生2,胡耀光1,蒋晓蓓1,郭柏苍2,王武宏1

  1. 北京理工大学机械与车辆学院,北京 100081
  2. 燕山大学车辆与能源学院,秦皇岛 066004
  • 收稿日期:2021-04-20 修回日期:2021-06-05 出版日期:2021-10-25 发布日期:2021-10-25
  • 通讯作者: 王武宏 E-mail:wangwuhong@bit.edu.cn
  • Funding:
    National Natural Science Foundation of China (51878045)

Fine-grained Vehicle Detection and Classification Model for Video Structuring Description

Jian Shi1, Qian Cheng1, Lisheng Jin2, Yaoguang Hu1, Xiaobei Jiang1, Baicang Guo2, Wuhong Wang1

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081
  2. School of Vehicle and Energy, Yanshan University, Qinhuangdao 066004
  • Received: 2021-04-20 Revised: 2021-06-05 Online: 2021-10-25 Published: 2021-10-25
  • Contact: Wuhong Wang E-mail: wangwuhong@bit.edu.cn

摘要:

针对无人驾驶环境感知技术中存在复杂交通场景理解受限的问题,本文中提出一种面向路侧端的视频结构化框架,通过丰富交通场景中不同目标的细粒度信息,提高复杂交通场景的理解能力。针对所提出的视频结构化框架,提供了一种可工程化的细粒度车辆检测分类模型。通过通道剪枝策略对YOLOv4算法进行优化,使压缩模型YOLOv4-Pruned在mAP几乎不变的情况下,较原模型体积减小约60%。设计了16种类型、12种颜色的车辆分类方法可有效覆盖当前交通场景下的运行车辆,在测试集上的分类准确率可达93%。本文设计的细粒度车辆检测分类模型在1920×1080输入,NVIDIA GeForce RTX 2080Ti 下可稳定在23FPS,在海思Hi3516DV300下,未量化的模型可稳定在13FPS。

关键词: 无人驾驶技术, 路侧端环境感知, 视频结构化算法, 细粒度车辆检测分类

Abstract:

To address the limited understanding of complex traffic scenes in environment perception for driverless vehicles, this paper proposes a roadside-oriented video structured description framework that enriches the fine-grained information on different targets in a traffic scene and thereby improves scene understanding. For this framework, a fine-grained vehicle detection and classification model suited to engineering deployment is provided. The YOLOv4 algorithm is optimized with a channel pruning strategy, so that the compressed model, YOLOv4-Pruned, is about 60% smaller than the original model while its mAP remains almost unchanged. A vehicle classification scheme with 16 types and 12 colors is designed to effectively cover the vehicles operating in current traffic scenes, and its classification accuracy on the test set reaches 93%. With 1920 × 1080 pixel input, the proposed model runs stably at 23 FPS on an NVIDIA GeForce RTX 2080Ti and, without quantization, at 13 FPS on a HiSilicon Hi3516DV300.

Key words: driverless technology, roadside environment perception, video structuring description algorithm, fine-grained vehicle detection and classification
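
The abstract states that YOLOv4 is compressed by channel pruning but does not say which pruning criterion is used. As an illustration only, the sketch below assumes one common criterion: ranking channels globally by the magnitude of their batch-normalization scale factors and dropping the smallest ones. The function names, layer names, scale values, and the `min_keep` safeguard are hypothetical and are not taken from the paper.

```python
# Illustrative sketch: global channel selection by BN scale magnitude.
# Assumption: the pruning criterion is |gamma| of each channel's BN layer;
# the paper's abstract does not confirm this. All names are hypothetical.

def select_channels(bn_scales, prune_ratio, min_keep=1):
    """Return, per layer, the sorted indices of the channels to keep.

    bn_scales:   dict mapping layer name -> list of BN scale factors.
    prune_ratio: fraction of all channels targeted for removal, in [0, 1).
    min_keep:    minimum number of channels kept in every layer.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")

    # Rank every channel in the network by |gamma|, smallest first.
    magnitudes = sorted(abs(g) for scales in bn_scales.values() for g in scales)
    n_prune = int(len(magnitudes) * prune_ratio)

    # Channels with |gamma| below this global threshold are removed.
    threshold = magnitudes[n_prune] if n_prune > 0 else float("-inf")

    keep = {}
    for name, scales in bn_scales.items():
        kept = [i for i, g in enumerate(scales) if abs(g) >= threshold]
        if len(kept) < min_keep:
            # Keep the strongest channels so the layer is never emptied.
            ranked = sorted(range(len(scales)),
                            key=lambda i: abs(scales[i]), reverse=True)
            kept = sorted(ranked[:min_keep])
        keep[name] = kept
    return keep


def kept_fraction(bn_scales, keep):
    """Fraction of all channels that survive pruning."""
    total = sum(len(scales) for scales in bn_scales.values())
    return sum(len(indices) for indices in keep.values()) / total


if __name__ == "__main__":
    # Made-up scale factors for two hypothetical layers.
    scales = {
        "conv1": [0.9, 0.01, 0.5, 0.02],
        "conv2": [0.03, 0.04, 0.8, 0.7],
    }
    keep = select_channels(scales, prune_ratio=0.5)
    print(keep)
    print(kept_fraction(scales, keep))
```

The fraction of channels kept is not the same as the reduction in model size: removing a channel shrinks both the layer that produces it and the layer that consumes it, so the roughly 60% size reduction reported in the abstract cannot be read off a channel ratio directly. In practice the pruned network is usually fine-tuned afterwards to recover accuracy.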