Automotive Engineering ›› 2024, Vol. 46 ›› Issue (11): 1973-1982. doi: 10.19562/j.chinasae.qcgc.2024.11.004



Pedestrian Trajectory Prediction Method Based on Multi-information Fusion Network

Song Gao1,2, Jianglin Zhou2, Bolin Gao1, Jian Lu2, He Wang2, Yueyun Xu2

  1. School of Vehicles and Mobility, Tsinghua University, Beijing 100000
    2. National Innovation Center of Intelligent and Connected Vehicles, Beijing 102600
  • Received: 2024-05-28  Revised: 2024-07-01  Online: 2024-11-25  Published: 2024-11-22
  • Contact: Bolin Gao  E-mail: gaobolin@tsinghua.edu.cn


Abstract:

With the continuous development of autonomous driving technology, accurately predicting the future trajectories of pedestrians has become a critical element in ensuring system safety and reliability. However, most existing studies on pedestrian trajectory prediction rely on fixed camera perspectives, which limits comprehensive observation of pedestrian movement and makes them difficult to apply directly to pedestrian trajectory prediction from the ego-vehicle perspective of an autonomous vehicle. To address this problem, this paper proposes an ego-vehicle-perspective pedestrian trajectory prediction method based on a Multi-Pedestrian Information Fusion Network (MPIFN), which accurately predicts pedestrians' future trajectories by fusing social information, local environmental information, and pedestrian temporal information. A local environmental information extraction module is constructed that combines deformable convolution with conventional convolution and pooling operations to extract local information from complex environments more effectively. By dynamically adjusting the sampling positions of the convolutional kernel, this module enhances the model's adaptability to irregular and complex shapes. In addition, a pedestrian spatiotemporal information extraction module and a multimodal feature fusion module are developed to fully integrate social and environmental information. Experimental results show that the proposed method achieves state-of-the-art performance on two ego-vehicle driving datasets, JAAD and PSI. On the JAAD dataset, the Center Final Mean Squared Error (CF_MSE) is 4 063 and the Center Mean Squared Error (C_MSE) is 829. On the PSI dataset, the Average Root Mean Square Error (ARB) and Final Root Mean Square Error (FRB) reach 18.08/29.21/44.98 and 25.27/54.62/93.09 for prediction horizons of 0.5, 1.0, and 1.5 s, respectively.
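The paper's implementation of the deformable-convolution-based local environment module is not given in the abstract. As an illustration of the core idea only — each kernel tap samples the feature map at its regular grid location plus a learned (dy, dx) offset, via bilinear interpolation — a minimal single-channel NumPy sketch follows; all function names, shapes, and the 3×3 kernel size are our own assumptions, not the paper's architecture.

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Bilinearly sample a 2D feature map at a real-valued location,
    treating everything outside the map as zero."""
    h, w = fmap.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                val += (1 - abs(y - yi)) * (1 - abs(x - xi)) * fmap[yi, xi]
    return val

def deformable_conv3x3(fmap, weight, offsets):
    """Single-channel 3x3 deformable convolution: each kernel tap samples
    at its regular grid position plus a learned (dy, dx) offset.
    fmap: (H, W); weight: (9,); offsets: (H, W, 9, 2)."""
    h, w = fmap.shape
    taps = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            for k, (ky, kx) in enumerate(taps):
                dy, dx = offsets[i, j, k]
                out[i, j] += weight[k] * bilinear_sample(fmap, i + ky + dy, j + kx + dx)
    return out

# With all offsets zero this reduces to an ordinary 3x3 convolution;
# nonzero offsets let the kernel deform to the local scene geometry.
fmap = np.arange(16.0).reshape(4, 4)
weight = np.zeros(9)
weight[4] = 1.0  # identity kernel: only the center tap is active
out = deformable_conv3x3(fmap, weight, np.zeros((4, 4, 9, 2)))
```

In practice one would use an optimized batched implementation such as `torchvision.ops.DeformConv2d`, where the offset field is predicted by a separate convolution over the same input.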

Key words: autonomous driving, pedestrian trajectory prediction, multi-pedestrian information fusion network, ego-vehicle
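The abstract does not detail the multimodal feature fusion module. One common late-fusion design — concatenating the social, environmental, and temporal feature vectors and passing them through a fully connected layer — can be sketched as follows; every dimension, name, and the fusion-by-concatenation choice here is an illustrative assumption, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed embedding sizes for the three information streams.
D_SOCIAL, D_ENV, D_TEMPORAL, D_FUSED = 32, 64, 32, 128

def fuse_features(social, env, temporal, w, b):
    """Concatenate per-pedestrian social, environmental, and temporal
    embeddings, then apply one fully connected layer with ReLU."""
    x = np.concatenate([social, env, temporal])  # (D_SOCIAL + D_ENV + D_TEMPORAL,)
    return np.maximum(w @ x + b, 0.0)            # (D_FUSED,)

# Random stand-ins for the outputs of the three extraction modules.
social = rng.standard_normal(D_SOCIAL)
env = rng.standard_normal(D_ENV)
temporal = rng.standard_normal(D_TEMPORAL)
w = 0.1 * rng.standard_normal((D_FUSED, D_SOCIAL + D_ENV + D_TEMPORAL))
b = np.zeros(D_FUSED)

fused = fuse_features(social, env, temporal, w, b)
```

The fused vector would then feed a trajectory decoder that outputs the future positions.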