Automotive Engineering (汽车工程), 2025, Vol. 47, Issue 8: 1490-1500. doi: 10.19562/j.chinasae.qcgc.2025.08.006

Research on Diffusion Reinforcement Learning Method for Vehicle Trajectory Tracking and Collision Avoidance of Autonomous Vehicles

Junjie Zhao (赵俊杰)1, Yinuo Wang (王以诺)2, Jiang Wu (吴江)1, Sichao Wu (吴思潮)1, Changdi Zou (邹昌迪)1, Hongda Wang (王洪达)1, Shengbo Eben Li (李升波)2, Fei Ma (马飞)1, Jingliang Duan (段京良)1

  1. School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083
  2. School of Vehicle and Mobility, Tsinghua University, Beijing 100084
  • Received: 2024-11-27 Revised: 2025-01-08 Published: 2025-08-25 Online release: 2025-08-18
  • Contact: Jingliang Duan E-mail: duanjl@ustb.edu.cn
  • Funding:
    National Natural Science Foundation of China (52202487); National Natural Science Foundation of China (62273256); Fundamental Research Funds for the Central Universities (FRF-OT-23-02)

Abstract:

Making autonomous vehicles more intelligent is key to the transformation and upgrading of the automotive industry, and trajectory tracking with collision avoidance is crucial to driving safety. To address the insufficient exploration of existing reinforcement learning control methods, this paper proposes a diffusion-based reinforcement learning algorithm. The conventional policy network is replaced with a diffusion-based generative policy network, which brings the multimodal distribution matching capability of diffusion models into reinforcement learning. Combining this policy with the distributional soft actor-critic algorithm yields the diffusion distributional actor-critic (DDAC) algorithm. Simulation and real-vehicle experiments show that the proposed algorithm explores efficiently: on the real vehicle, the mean lateral tracking error is below 0.03 m and the mean speed tracking error is below 0.05 m/s, which verifies the advantages of the algorithm.

Keywords: trajectory tracking, active collision avoidance, distributional reinforcement learning, diffusion model
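
The abstract describes replacing a conventional policy network with a diffusion-based generative policy. As a rough illustration of what sampling an action from such a policy can look like, the sketch below runs a generic DDPM-style reverse denoising loop conditioned on the vehicle state. It is not the authors' DDAC implementation: the noise schedule values, the function names, and `toy_noise_predictor` (a hand-written stand-in for a trained noise-prediction network) are all assumptions made for this example.

```python
import numpy as np


def make_schedule(num_steps=20, beta_min=1e-4, beta_max=0.2):
    """Linear noise schedule; the values are illustrative assumptions."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_noise_predictor(state, noisy_action, step):
    """Hypothetical stand-in for a trained noise-prediction network.

    A real diffusion policy would use a neural network here. This toy
    version ignores `step` and returns the offset from a state-dependent
    target, so each denoising step pulls the action toward that target.
    """
    target = np.tanh(state[: noisy_action.shape[0]])
    return noisy_action - target


def sample_action(state, action_dim, noise_predictor, rng, num_steps=20):
    """Draw one action by DDPM-style reverse denoising from Gaussian noise."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    action = rng.standard_normal(action_dim)  # start from pure noise
    for t in reversed(range(num_steps)):
        eps = noise_predictor(state, action, t)
        # Posterior mean of the previous (less noisy) action.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (action - coef * eps) / np.sqrt(alphas[t])
        # No noise is added on the final denoising step.
        noise = rng.standard_normal(action_dim) if t > 0 else np.zeros(action_dim)
        action = mean + np.sqrt(betas[t]) * noise
    return np.clip(action, -1.0, 1.0)  # keep commands in a normalized range


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = np.array([0.1, -0.2, 0.3, 0.0])  # hypothetical state vector
    print(sample_action(state, 2, toy_noise_predictor, rng))
```

Because every call starts from fresh Gaussian noise and injects noise at each step, repeated calls for the same state return different actions. That stochastic, potentially multimodal sampling is the property the abstract credits for better exploration. The two action dimensions here could stand for, say, normalized steering and acceleration commands, which is again an assumption of this sketch.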