Administered by China Association for Science and Technology
Sponsored by China Society of Automotive Engineers
Published by AUTO FAN Magazine Co. Ltd.

Automotive Engineering ›› 2023, Vol. 45 ›› Issue (10): 1779-1790. doi: 10.19562/j.chinasae.qcgc.2023.10.001

Special Issue: Intelligent and Connected Vehicle Technology: Perception, HMI & Evaluation, 2023


Pedestrian Crossing Intention Prediction Method Based on Multimodal Feature Fusion

Long Chen1, Chen Yang1, Yingfeng Cai1, Hai Wang2, Yicheng Li2

  1. Institute of Automotive Engineering, Jiangsu University, Zhenjiang 212013
  2. School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013
  • Received: 2023-02-13  Revised: 2023-03-14  Online: 2023-10-25  Published: 2023-10-23
  • Contact: Yingfeng Cai  E-mail: caicaixiao0304@126.com

Abstract:

Pedestrian behavior prediction is one of the main challenges facing the decision-planning systems of intelligent vehicles in urban environments, and improving the accuracy of pedestrian crossing intention prediction is of great significance for driving safety. Existing methods rely heavily on the location of the pedestrian bounding box and rarely consider environmental information in the traffic scene or the interactions among traffic objects. To address these problems, a pedestrian crossing intention prediction method based on multimodal feature fusion is proposed. By combining multiple attention mechanisms, a new global scene context extraction module and a local scene spatiotemporal feature extraction module are constructed to enhance the extraction of spatiotemporal features from the scene around the vehicle, and semantic parsing of the scene is used to capture the interactions between pedestrians and their surroundings, addressing the insufficient use of traffic-environment context and inter-object interaction information. In addition, a multimodal feature fusion module based on a hybrid fusion strategy is designed, which performs joint reasoning over visual and motion features according to the complexity of the different information sources and provides reliable input to the pedestrian crossing intention prediction module. Tests on the JAAD dataset show that the proposed method achieves a prediction accuracy of 0.84, 10.5% higher than the baseline method. Compared with existing models of the same type, the proposed method delivers the best overall performance and applies to a wider range of scenarios.
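The hybrid fusion strategy summarized above (attention-based pooling of the visual streams first, then late fusion with the simpler motion stream) can be sketched in NumPy. This is an illustrative toy, not the paper's implementation: all function names, feature dimensions, and the untrained random weights are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys, values):
    # Scaled dot-product attention: pool a set of feature vectors
    # into one vector, weighted by similarity to the query.
    scores = query @ keys.T / np.sqrt(keys.shape[-1])   # (n,)
    return softmax(scores) @ values                     # (d,)

def hybrid_fusion(global_ctx, local_feats, motion_feats, w_vis, w_out):
    """Hybrid (early + late) fusion of visual and motion cues.

    global_ctx:   (d,)   pooled global scene context feature
    local_feats:  (n, d) per-frame local spatiotemporal features
    motion_feats: (m,)   bounding-box / kinematics descriptor
    """
    # Early fusion: attention-pool the local features using the
    # global context as query, then concatenate the visual streams.
    local_pooled = attend(global_ctx, local_feats, local_feats)
    visual = np.concatenate([global_ctx, local_pooled])  # (2d,)
    visual = np.tanh(w_vis @ visual)                     # project

    # Late fusion: join the simpler motion stream afterwards and
    # map the joint feature to a crossing-intention probability.
    joint = np.concatenate([visual, motion_feats])
    logit = w_out @ joint
    return 1.0 / (1.0 + np.exp(-logit))                  # sigmoid

# Toy dimensions and random weights (untrained, shape-checking only).
d, n, m, h = 8, 5, 4, 6
w_vis = rng.standard_normal((h, 2 * d))
w_out = rng.standard_normal(h + m)

p_cross = hybrid_fusion(rng.standard_normal(d),
                        rng.standard_normal((n, d)),
                        rng.standard_normal(m),
                        w_vis, w_out)
print(p_cross)  # crossing-intention probability in (0, 1)
```

The early/late split mirrors the paper's stated rationale: complex visual sources are fused first at the feature level, while the lower-dimensional motion cue joins only at the decision stage.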

Key words: autonomous vehicles, pedestrian intention prediction, multimodal feature fusion, attention mechanism