汽车工程 (Automotive Engineering) ›› 2023, Vol. 45 ›› Issue (8): 1343-1352. doi: 10.19562/j.chinasae.qcgc.2023.08.005

Special Topic: Intelligent and Connected Vehicle Technology - Planning & Decision-Making, 2023

Research on End-to-End Vehicle Motion Planning Method Based on Deep Learning

Weiguo Liu1,2, Zhiyu Xiang1, Rui Liu2, Guodong Li3, Zixu Wang2

  1. College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou 310058
  2. National Innovation Center of Intelligent and Connected Vehicles, Beijing 100160
  3. School of Vehicle Engineering, Chongqing University of Technology, Chongqing 400054
  • Received: 2023-04-28  Revised: 2023-06-18  Online: 2023-08-25  Published: 2023-08-17
  • Corresponding author: Zhiyu Xiang  E-mail: xiangzy@zyu.edu.cn
  • Funding: Project of the National New Generation Artificial Intelligence Open Innovation Platform for Autonomous Driving (2020AAA0103702)

Abstract:

In existing end-to-end deep learning frameworks for autonomous driving, planning and control predictions often suffer from low accuracy, largely because the input data come from a single source and the models cannot account for both spatial and temporal information. To better reflect how the historical interaction between the ego vehicle, the environment, and other traffic participants affects the decision at the current moment in virtual simulation testing, this paper designs a multi-level spatiotemporal attention long short-term memory (LSTM) network for vehicle motion planning in an autonomous driving simulation environment. The algorithm extracts and represents deep abstract information about the driving environment and realizes end-to-end vehicle motion control in the simulation platform. First, the historical sequence of consecutive RGB video frames acquired by the forward-facing camera model is taken as input, and a convolutional module extracts the spatial features of each single-moment image. Second, an LSTM module fuses the spatial information across historical moments to obtain temporal context features. In addition, to strengthen the extraction of key spatiotemporal information and accelerate network convergence, a spatiotemporal attention mechanism is applied in the fusion of the multi-level spatiotemporal features. The proposed method is tested and validated on the Carla simulation platform. The experimental results show that, compared with algorithms that use only single spatial or temporal information, the proposed method imitates human driving decision-making behavior more accurately.

Key words: vehicle motion planning, end-to-end, spatiotemporal attention, deep learning, simulation, LSTM
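
As an illustration of the pipeline described in the abstract, the following is a minimal PyTorch sketch, not the authors' released implementation: it assumes a small stand-in convolutional backbone, an LSTM over per-frame features, a simple softmax attention over the time axis, and a two-dimensional control output (e.g. steering and throttle). The SpatioTemporalPlanner class, its layer sizes, and the output dimensions are all hypothetical.

# Illustrative sketch only: CNN per-frame features -> LSTM temporal fusion ->
# attention-weighted pooling -> control regression, as outlined in the abstract.
import torch
import torch.nn as nn


class SpatioTemporalPlanner(nn.Module):
    def __init__(self, feat_dim: int = 256, hidden_dim: int = 128):
        super().__init__()
        # Per-frame spatial feature extractor (stand-in for the paper's convolutional module).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Temporal fusion of the per-frame spatial features.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        # One attention score per time step, normalized over the time axis.
        self.attn = nn.Linear(hidden_dim, 1)
        # Regression head for the control outputs (e.g. steering, throttle).
        self.head = nn.Linear(hidden_dim, 2)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) history of front-camera RGB frames.
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)   # (b, t, feat_dim)
        hidden, _ = self.lstm(feats)                            # (b, t, hidden_dim)
        weights = torch.softmax(self.attn(hidden), dim=1)       # (b, t, 1)
        context = (weights * hidden).sum(dim=1)                 # attention-weighted fusion
        return self.head(context)                               # (b, 2) control commands


if __name__ == "__main__":
    model = SpatioTemporalPlanner()
    dummy = torch.randn(2, 8, 3, 128, 128)   # 2 sequences of 8 frames each
    print(model(dummy).shape)                 # torch.Size([2, 2])

A model defined this way maps a (batch, time, channel, height, width) tensor of past camera frames to one control vector per sequence; the softmax attention weights make explicit how much each historical frame contributes to the current decision, which is the role the abstract assigns to the spatiotemporal attention mechanism.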