Automotive Engineering (汽车工程), 2022, Vol. 44, Issue 3: 340-349. doi: 10.19562/j.chinasae.qcgc.2022.03.005

Special topic: Intelligent Connected Vehicle Technology: Perception, HMI and Evaluation, 2022

Real-time Detection of 3D Objects Based on Multi-Sensor Information Fusion

Desheng Xie, Youchun Xu, Feng Lu, Shiju Pan

  1. Institute of Military Transportation, Army Military Transportation University, Tianjin 300161
  • Received: 2021-10-09 Revised: 2021-11-03 Online: 2022-03-25 Published: 2022-03-25
  • Contact: Feng Lu E-mail: 1849048346@qq.com
  • Funding:
    Supported by the Military Key Discipline and Specialty Construction Project (Frontier Tracking Research on Key Technologies of Intelligent Unmanned Systems)

Abstract:

For 3D object detection based on multi-sensor information fusion, a real-time, high-accuracy two-stage deep neural network, PointRGBNet, is proposed. In the first stage, the region proposal network projects the 3D point cloud onto the 2D image to generate a 6D RGB point cloud, extracts features from this 6D input to obtain a low-dimensional feature map and a high-dimensional feature map, and uses the fused feature map to generate a large number of high-confidence proposals. In the second stage, the object detection network applies RoI pooling with the first-stage proposals to obtain, from the feature map, the feature set corresponding to each proposal; learning specifically from each proposal's feature set yields more accurate 3D object detection. Public test results on the KITTI dataset show that PointRGBNet achieves higher detection accuracy than object detection networks that use only 2D images or only 3D point clouds, and even than some advanced multi-sensor information fusion networks, while the whole network detects objects at 12 frames/s, meeting real-time requirements.

Key words: object detection, 2D images, 3D point clouds, deep neural networks
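
The first-stage fusion step described in the abstract, projecting the 3D point cloud onto the 2D image to form a 6D RGB point cloud, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `make_rgb_point_cloud`, the single 3x4 projection matrix `proj` (assumed to already combine the LiDAR-to-camera extrinsics and the camera intrinsics), nearest-pixel colour sampling, and the choice to drop points that fall outside the image are all assumptions made for illustration.

```python
import numpy as np


def make_rgb_point_cloud(points, image, proj):
    """Attach image colours to 3D points (illustrative sketch only).

    points: (N, 3) float array of x, y, z coordinates.
    image:  (H, W, 3) uint8 RGB image.
    proj:   (3, 4) matrix mapping homogeneous 3D points to homogeneous
            pixel coordinates (assumed to include extrinsics and intrinsics).
    Returns an (M, 6) float32 array [x, y, z, r, g, b] with colours scaled
    to [0, 1], keeping only points that project inside the image.
    """
    h, w = image.shape[:2]

    # Homogeneous coordinates, then projection to the image plane.
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    uvw = homo @ proj.T                                        # (N, 3)

    # Points at or behind the camera plane cannot be seen.
    depth = uvw[:, 2]
    in_front = depth > 1e-6
    safe_depth = np.where(in_front, depth, 1.0)  # avoid division by zero

    # Perspective division, then snap to the containing pixel.
    u = np.floor(uvw[:, 0] / safe_depth).astype(int)
    v = np.floor(uvw[:, 1] / safe_depth).astype(int)

    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    # Sample the colour of each visible point (row index = v, column = u).
    rgb = image[v[visible], u[visible]].astype(np.float32) / 255.0
    return np.hstack([points[visible].astype(np.float32), rgb])
```

With KITTI-style calibration, `proj` would typically be the product of the camera projection matrix, the rectification matrix, and the LiDAR-to-camera transform; the resulting (M, 6) array is the kind of input a point-based feature extractor could consume.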