Automotive Engineering (汽车工程) ›› 2024, Vol. 46 ›› Issue (5): 882-892. doi: 10.19562/j.chinasae.qcgc.2024.ep.001


  • Supported by the National Key R&D Program of China (2022YFB2503203)

A Lane Change Decision Method for Intelligent Connected Vehicles Based on a Mixture of Experts Model

Fuxing Yao1, Chao Sun1, Yungang Lan2, Bing Lu3, Bo Wang3, Haiyang Yu4

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081
    2. Shenzhen Boundless Sensor Technology Co., Ltd., Shenzhen 518000
    3. Shenzhen Automotive Research Institute of Beijing Institute of Technology, Shenzhen 518122
    4. School of Transportation Science and Engineering, Beihang University, Beijing 100191
  • Received: 2024-03-10  Revised: 2024-04-01  Online: 2024-05-25  Published: 2024-05-17
  • Contact: Bing Lu, Bo Wang  E-mail: lubingev@sina.com; wangbo@szari.ac.cn


Abstract:

The problem of lane-changing decision-making on highways, characterized by complex scenarios, strong uncertainty, and stringent real-time requirements, is a research hotspot and challenge in the field of autonomous driving both domestically and internationally. Deep Reinforcement Learning (DRL) exhibits excellent real-time decision-making capability and adaptability to complex scenarios. However, under the constraints of limited training samples and cost, its learning effectiveness remains limited, making it difficult to ensure optimal driving efficiency and complete driving safety. In this paper, a DRL-Mixture of Experts (DRL-MOE) lane-changing decision-making method based on an improved DRL model is proposed. Firstly, the upper-level classifier dynamically determines the activation status of the lower-level DRL expert or heuristic experts based on the input state features. Then, to enhance the learning effectiveness of the DRL expert, the method uses Behavior Cloning (BC) to initialize the neural network parameters, improving on the traditional Deep Deterministic Policy Gradient (DDPG) algorithm. Finally, the Intelligent Driver Model (IDM) and the strategy of Minimizing Overall Braking Induced by Lane changes (MOBIL) are designed as heuristic experts to ensure driving safety. The simulation results show that, compared with a non-mixture-of-experts DRL method, the proposed DRL-MOE model improves driving efficiency by 15.04% while ensuring zero collisions and zero road departures, demonstrating higher robustness and superior performance.
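As a rough illustration of the gating idea described above (not the authors' implementation), the structure can be sketched in Python: an upper-level classifier routes each state either to a safety-oriented heuristic expert (IDM car-following; MOBIL would screen lane changes) or to the learned DRL expert. All parameter values, the time-to-collision gate, and the stubbed DRL policy below are assumptions made for illustration only.

```python
import math

# Illustrative IDM parameters (the paper does not publish its settings):
# max accel, comfortable decel, desired speed, time headway, standstill gap.
A_MAX, B_COMF, V0, T_HW, S0 = 2.0, 3.0, 30.0, 1.5, 2.0

def idm_acceleration(v, gap, dv):
    """Intelligent Driver Model longitudinal acceleration.
    v: ego speed (m/s); gap: distance to leader (m); dv: closing rate (m/s)."""
    s_star = S0 + v * T_HW + v * dv / (2.0 * math.sqrt(A_MAX * B_COMF))
    return A_MAX * (1.0 - (v / V0) ** 4 - (max(s_star, 0.0) / gap) ** 2)

def drl_expert(state):
    # Placeholder for the trained BC-initialized DDPG policy network.
    return {"lane_change": 0, "accel": 0.5}

def heuristic_expert(state):
    # IDM handles car-following; a MOBIL check would gate lane changes.
    return {"lane_change": 0,
            "accel": idm_acceleration(state["v"], state["gap"], state["dv"])}

def moe_decide(state, ttc_threshold=4.0):
    """Upper-level classifier (hypothetical gate): fall back to the
    heuristic expert when time-to-collision is short, else use DRL."""
    ttc = state["gap"] / state["dv"] if state["dv"] > 0 else float("inf")
    expert = heuristic_expert if ttc < ttc_threshold else drl_expert
    return expert(state)
```

For example, a closing rate of 5 m/s at a 10 m gap gives a time-to-collision of 2 s, so the gate selects the heuristic expert, which commands hard IDM braking; with a large gap the DRL expert's action is used instead.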

Key words: autonomous driving, high-speed lane-change decision-making, deep reinforcement learning, mixture of experts