Administrated by China Association for Science and Technology
Sponsored by China Society of Automotive Engineers
Published by AUTO FAN Magazine Co. Ltd.

Automotive Engineering ›› 2023, Vol. 45 ›› Issue (9): 1617-1625. doi: 10.19562/j.chinasae.qcgc.2023.09.010

Special Issue: Intelligent Connected Vehicle Technology: Perception, HMI and Evaluation, 2023

Lightweight Semantic Segmentation Method Based on Local Window Cross Attention

Zuliang Jin1, Hanbing Wei1, Liu Zheng1,2, Lu Lou1, Guofeng Zheng1

  1. School of Electromechanical and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074
  2. University of British Columbia Okanagan, Kelowna, BC, Canada
  • Received: 2022-11-28 Revised: 2023-01-03 Online: 2023-09-25 Published: 2023-09-23
  • Contact: Hanbing Wei E-mail: hbwei@cqjtu.edu.cn

Abstract:

In environmental perception for autonomous vehicles, semantic segmentation of lanes, vehicles and other targets in Bird's Eye View (BEV) coordinates using surround-view cameras has attracted wide attention. Inference latency grows linearly with the number of cameras, which makes it difficult to complete semantic segmentation in real time for autonomous driving perception. To address this problem, this paper proposes a lightweight semantic segmentation method based on local window cross-attention. The model uses an improved EdgeNeXt backbone network to extract image features. Local window cross-attention between the BEV queries and the image features establishes feature queries across camera views. The fused BEV feature map is then decoded by upsampling residual blocks to obtain the BEV semantic segmentation result. Experiments on the nuScenes dataset show that the proposed method achieves 35.1% mIoU on the static lane segmentation task in the BEV map, which is 2.2% higher than that of HDMapNet. In particular, inference speed is 58.2% higher than that of GKT, reaching a frame rate of 106 FPS.
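
To make the idea of local window cross-attention concrete, the following is a minimal, dependency-free sketch and not the paper's implementation. It assumes that each BEV query already has a projected reference point `(row, col)` in every camera's feature map (or `None` when the camera does not see that cell), that keys and values are the raw image features without learned projections, and that the window is square. The names `local_window`, `window_cross_attention`, `ref_points` and `radius` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_window(feature_map, center, radius):
    """Collect feature vectors in a (2*radius+1)^2 window, clipped at the border."""
    rows, cols = len(feature_map), len(feature_map[0])
    r0, c0 = center
    return [
        feature_map[r][c]
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))
    ]


def window_cross_attention(query, camera_maps, ref_points, radius=1):
    """Attend one BEV query to local windows gathered from every camera.

    query:       list of d floats (one BEV grid cell)
    camera_maps: list of feature maps, each indexed as [row][col][channel]
    ref_points:  per camera, the (row, col) the BEV cell projects to,
                 or None if the cell is outside that camera's view (assumed given)
    """
    keys = []
    for fmap, ref in zip(camera_maps, ref_points):
        if ref is not None:
            keys.extend(local_window(fmap, ref, radius))
    if not keys:
        return list(query)  # no camera sees this cell; leave the query unchanged

    # Scaled dot-product scores between the query and every key in the windows.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)

    # Values equal keys in this sketch (no learned projections).
    return [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(len(query))]
```

In this sketch each query attends to at most `(2*radius+1)**2` features per visible camera instead of every pixel of every image, which illustrates why restricting attention to a local window keeps the per-query cost bounded as cameras are added.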

Key words: BEV, semantic segmentation, local window, cross-attention