Administrated by China Association for Science and Technology
Sponsored by China Society of Automotive Engineers
Published by AUTO FAN Magazine Co. Ltd.

Automotive Engineering ›› 2024, Vol. 46 ›› Issue (9): 1608-1616. doi: 10.19562/j.chinasae.qcgc.2024.09.008


A LiDAR-Based Dynamic Driving Scene Multi-task Segmentation Network

Hai Wang1, Jianguo Li1, Yingfeng Cai2, Long Chen2

  1. School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013
  2. Automotive Engineering Research Institute, Jiangsu University, Zhenjiang 212013
  • Received: 2024-01-23; Revised: 2024-04-23; Online: 2024-09-25; Published: 2024-09-19
  • Contact: Hai Wang, E-mail: wanghai1019@163.com

Abstract:

In autonomous driving scene understanding, accurate segmentation of the drivable area and of dynamic and static objects is essential for subsequent local motion planning and motion control. However, current general-purpose semantic segmentation methods based on LiDAR point clouds can neither achieve real-time, robust prediction on vehicle-side edge computing devices nor predict the motion state of objects at the current moment. To address this problem, this paper proposes MultiSegNet, a multi-task segmentation network for drivable areas and dynamic and static objects. The network takes the depth map output by the LiDAR and the processed residual images as encoded representations of spatial and motion features, thereby avoiding the direct processing of unordered, high-density point clouds. To handle the large differences in the number of targets across different directions of the depth map, a variable-resolution grouped input strategy is proposed, which reduces the network's computational cost and improves its segmentation accuracy. To adapt the convolutional receptive field to targets of different scales, a depth-value-guided hierarchical dilated convolution module is proposed. In addition, to effectively associate and fuse the spatial position and pose information of objects across different time steps, a spatiotemporal motion feature enhancement network is proposed. The effectiveness of MultiSegNet is verified on the large-scale point cloud driving scene datasets SemanticKITTI and nuScenes. The results show that the segmentation IoU for the drivable area, static objects, and dynamic objects reaches 98%, 97%, and 70%, respectively, outperforming mainstream networks while achieving real-time inference on edge computing devices.
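The abstract describes encoding spatial features as a LiDAR depth (range) map and motion features as residual images between scans. The following is a minimal sketch of this common representation, not the authors' code; the image resolution, vertical field of view, and normalization are assumed values.

```python
# Minimal sketch (assumed parameters): spherical projection of a LiDAR scan to a
# range (depth) image, plus a residual image between the current scan and a
# previous, ego-motion-compensated scan, as input spatial/motion representations.
import numpy as np

H, W = 64, 2048                 # assumed range-image resolution
FOV_UP, FOV_DOWN = 3.0, -25.0   # assumed vertical field of view in degrees

def range_projection(points):
    """Project an (N, 3) point cloud to an (H, W) range image."""
    depth = np.linalg.norm(points, axis=1)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    yaw = -np.arctan2(y, x)                          # horizontal angle
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))   # vertical angle

    fov = np.deg2rad(FOV_UP - FOV_DOWN)
    u = 0.5 * (yaw / np.pi + 1.0) * W                            # column index
    v = (1.0 - (pitch + abs(np.deg2rad(FOV_DOWN))) / fov) * H    # row index
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    img = np.full((H, W), -1.0, dtype=np.float32)    # -1 marks empty pixels
    order = np.argsort(depth)[::-1]                  # closer points overwrite farther ones
    img[v[order], u[order]] = depth[order]
    return img

def residual_image(range_cur, range_prev_aligned, eps=1e-8):
    """Normalized per-pixel depth difference between the current range image and a
    previous scan projected into the current frame; large values hint at motion."""
    valid = (range_cur > 0) & (range_prev_aligned > 0)
    res = np.zeros_like(range_cur)
    res[valid] = np.abs(range_cur[valid] - range_prev_aligned[valid]) / (range_cur[valid] + eps)
    return res
```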
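The abstract also mentions a depth-value-guided hierarchical dilated convolution module for adapting the receptive field to target scale. The sketch below is a hypothetical PyTorch interpretation of that idea (parallel dilated branches gated by the depth channel); the actual module design is not specified in the abstract.

```python
# Hypothetical sketch (assumed design, not the authors' implementation): parallel
# dilated convolutions blended per pixel by gates predicted from the depth image,
# so small/distant targets favour small dilation and large/nearby targets favour
# larger receptive fields.
import torch
import torch.nn as nn

class DepthGuidedDilatedBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        # gate network maps the 1-channel depth image to per-branch weights
        self.gate = nn.Sequential(
            nn.Conv2d(1, len(dilations), kernel_size=1),
            nn.Softmax(dim=1),
        )

    def forward(self, feat, depth):
        # feat: (B, C, H, W) features; depth: (B, 1, H, W) range image
        weights = self.gate(depth)                        # (B, K, H, W)
        out = 0
        for k, branch in enumerate(self.branches):
            out = out + weights[:, k:k + 1] * branch(feat)  # depth-weighted fusion
        return out
```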

Key words: autonomous driving, LiDAR, multi-task point cloud segmentation network, dynamic object segmentation