Automotive Engineering (汽车工程), 2021, Vol. 43, Issue 1: 59-67. doi: 10.19562/j.chinasae.qcgc.2021.01.008

Lane-change Behavior Decision-making of Intelligent Vehicle Based on Imitation Learning and Reinforcement Learning

Xiaolin Song, Xin Sheng, Haotian Cao, Mingjun Li, Binlin Yi, Zhi Huang

  1. State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha 410082
  • Received: 2020-06-17 Revised: 2020-08-06 Online: 2021-01-25 Published: 2021-02-03
  • Contact: Xiaolin Song E-mail: jqysxl@hnu.edu.cn
  • Funding:
    National Natural Science Foundation of China (51975194); National Natural Science Foundation of China, Young Scientists Fund (51905161)

Abstract:

A lane-change behavior decision-making method for intelligent vehicles is proposed based on imitation learning and reinforcement learning. The macro decision-making module builds an extreme gradient boosting model through imitation learning and, according to the input information, selects a macro instruction from lane keeping, left lane change and right lane change, thereby determining which lane-change decision sub-problem needs to be solved. Each detailed decision-making sub-module obtains its optimized policy through deep deterministic policy gradient reinforcement learning and solves the corresponding sub-problem to determine the target position of the ego vehicle's motion, which is then sent to lower-level modules for execution. Simulation results show that the proposed method learns its policy faster than pure reinforcement learning, and that its overall performance is better than that of a finite state machine, behavior-cloning imitation learning and pure reinforcement learning.

Key words: intelligent vehicle, behavior decision-making, reinforcement learning, imitation learning
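
The two-level structure described in the abstract (a macro module that picks one of three maneuvers, followed by a maneuver-specific sub-module that outputs a target position) can be illustrated with a minimal dispatch sketch. This is not the authors' implementation. The names `Maneuver`, `HierarchicalDecider`, `stub_macro` and `make_stub_sub`, the three-element feature layout, the 30 m gap threshold, the 20 m look-ahead cap and the 3.5 m lane width are all hypothetical placeholders. In the paper the macro policy would be a trained extreme gradient boosting classifier and each sub-policy a trained deep deterministic policy gradient actor; hand-written rules stand in for both here so that the example runs without any trained model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Sequence, Tuple


class Maneuver(Enum):
    """The three macro instructions named in the abstract."""
    LANE_KEEP = 0
    LEFT_CHANGE = 1
    RIGHT_CHANGE = 2


# Hypothetical interfaces (not from the paper):
# a macro policy maps a feature vector to a maneuver, and a sub-policy maps
# the same features to a (longitudinal, lateral) target offset in metres.
MacroPolicy = Callable[[Sequence[float]], Maneuver]
SubPolicy = Callable[[Sequence[float]], Tuple[float, float]]


@dataclass
class HierarchicalDecider:
    """Route each decision through the macro policy, then one sub-policy."""
    macro: MacroPolicy
    sub_policies: Dict[Maneuver, SubPolicy]

    def decide(self, features: Sequence[float]) -> Tuple[Maneuver, Tuple[float, float]]:
        maneuver = self.macro(features)                 # step 1: pick the sub-problem
        target = self.sub_policies[maneuver](features)  # step 2: solve only that one
        return maneuver, target


LANE_WIDTH_M = 3.5  # assumed lane width, placeholder value


def stub_macro(features: Sequence[float]) -> Maneuver:
    """Rule-based stand-in for the imitation-learned classifier.

    Assumed feature layout: (gap to lead vehicle in m, left-lane-free flag,
    right-lane-free flag). The 30 m threshold is arbitrary.
    """
    gap, left_free, right_free = features
    if gap >= 30.0:
        return Maneuver.LANE_KEEP
    if left_free:
        return Maneuver.LEFT_CHANGE
    if right_free:
        return Maneuver.RIGHT_CHANGE
    return Maneuver.LANE_KEEP


def make_stub_sub(lateral_offset: float) -> SubPolicy:
    """Stand-in for a trained actor: fixed lateral offset, capped look-ahead."""
    def policy(features: Sequence[float]) -> Tuple[float, float]:
        gap = features[0]
        return (min(gap, 20.0), lateral_offset)  # 20 m cap is arbitrary
    return policy


decider = HierarchicalDecider(
    macro=stub_macro,
    sub_policies={
        Maneuver.LANE_KEEP: make_stub_sub(0.0),
        Maneuver.LEFT_CHANGE: make_stub_sub(+LANE_WIDTH_M),   # left taken as positive
        Maneuver.RIGHT_CHANGE: make_stub_sub(-LANE_WIDTH_M),
    },
)

print(decider.decide((10.0, 1.0, 0.0)))
```

Because the macro module selects the sub-problem first, each sub-policy only has to cover the situations of its own maneuver. This is one plausible reading of why the abstract reports faster policy learning than a single reinforcement learning agent that handles all three maneuvers.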