Automotive Engineering ›› 2022, Vol. 44 ›› Issue (12): 1818-1824. doi: 10.19562/j.chinasae.qcgc.2022.12.003

Special topic: Intelligent Connected Vehicle Technology (Perception, HMI & Evaluation), 2022


  • Funding:
    Harbin Institute of Technology Major Scientific Research Project Cultivation Program (ZDXMPY20180109)

Semantic Segmentation Method of Autonomous Driving Images Based on Atrous Spatial Pyramid Pooling

Dafang Wang1, Lei Liu1, Jiang Cao1, Gang Zhao1, Wenshuo Zhao1, Wei Tang2

  1. School of Automotive Engineering, Harbin Institute of Technology, Weihai 264200
  2. Department of Arms and Control, Army Academy of Armored Forces, Beijing 100072
  • Received: 2022-06-21 Revised: 2022-07-14 Online: 2022-12-25 Published: 2022-12-22
  • Contact: Jiang Cao, Wei Tang E-mail: 1964611621@qq.com; 630266501@qq.com

Abstract:

If a vehicle can accurately and quickly understand the semantics of the people and vehicles on the road, that understanding can largely guide obstacle avoidance and path planning. Existing semantic segmentation methods based on deep learning cannot achieve high segmentation speed and high segmentation accuracy at the same time. Building on an existing semantic segmentation network, this paper adds an atrous spatial pyramid pooling (ASPP) structure after the feature-extraction baseline network to capture multi-scale semantic information from the image. Experimental results show that the two proposed modules, A_ASPP_1 and A_ASPP_2, effectively segment the people and the various types of vehicles commonly found in autonomous driving scenes. Compared with the existing two-branch network BiSeNet, the two corresponding improved networks raise the mean intersection over union (mIoU) of the training results by 2.1 and 1.2 percentage points, respectively, at the cost of a slightly lower segmentation speed.
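The idea behind atrous spatial pyramid pooling can be illustrated with a one-dimensional toy example. The sketch below is not the paper's A_ASPP_1 or A_ASPP_2 module, whose structure the abstract does not describe. It only shows parallel atrous (dilated) filters with different rates being applied to the same feature vector and stacked per position. The function names, the kernel, and the rates (1, 2, 4) are assumptions chosen for illustration. Practical ASPP blocks usually add further branches and a fusion layer, which are omitted here.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """1-D atrous (dilated) filter with zero padding and same-length output.

    As in deep learning frameworks, this is a cross-correlation: the kernel is not flipped.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # kernel taps are spaced `rate` samples apart
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp1d(signal: Sequence[float], kernel: Sequence[float],
           rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Run one atrous branch per rate and stack the results per position."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


def receptive_field(kernel_size: int, rate: int) -> int:
    """Span of input samples covered by a single atrous filter."""
    return (kernel_size - 1) * rate + 1


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # hypothetical feature vector
    for position, features in enumerate(aspp1d(impulse, [1, 1, 1])):
        print(position, features)
    print([receptive_field(3, r) for r in (1, 2, 4)])
```

With a three-tap kernel, the rate-1 branch responds to the impulse at positions 3 to 5, the rate-2 branch at positions 2, 4 and 6, and the rate-4 branch at positions 0, 4 and 8. Each position therefore carries context from several scales at once without any increase in the number of kernel weights.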

Key words: semantic segmentation, autonomous driving, atrous spatial pyramid pooling
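The mean intersection over union reported in the abstract can be computed as in the following minimal sketch. It assumes integer class labels stored in flattened per-pixel lists and skips classes that appear in neither the prediction nor the ground truth. The function name and the sample labels are hypothetical. Evaluation code for real datasets normally accumulates a confusion matrix over all images instead.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection over union across classes for flattened label maps."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps, and pixels labelled c in at least one map.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # ignore classes absent from both prediction and ground truth
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    target = [0, 0, 1, 1, 2, 2]  # hypothetical ground-truth labels
    pred = [0, 1, 1, 1, 2, 0]    # hypothetical predicted labels
    print(mean_iou(pred, target, num_classes=3))
```

For these sample labels the per-class IoU values are 1/3, 2/3 and 1/2. On this 0 to 1 scale, a gain of 2.1 percentage points corresponds to an absolute increase of 0.021 in mIoU.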