Administered by China Association for Science and Technology
Sponsored by China Society of Automotive Engineers
Published by AUTO FAN Magazine Co. Ltd.

Automotive Engineering ›› 2025, Vol. 47 ›› Issue (9): 1674-1685. doi: 10.19562/j.chinasae.qcgc.2025.09.004

End-to-End Decision-Making Model Based on Reinforcement Learning Incorporating Bird's Eye View Representation

Baixue Tang1, Yingfeng Cai1, Long Chen1, Hai Wang2, Zhongyu Rao1, Ze Liu1

  1. Institute of Automotive Engineering, Jiangsu University, Zhenjiang 212013
  2. School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013
  • Received:2024-12-20 Revised:2025-02-18 Online:2025-09-25 Published:2025-09-19
  • Contact: Yingfeng Cai E-mail:caicaixiao0304@126.com

Abstract:

End-to-end decision-making and planning models for autonomous driving are an active research direction in the industry. However, the spatial and temporal inconsistency between sensor signals and action outputs, together with the convergence difficulties of end-to-end models, greatly limits their effectiveness in practical applications. To address this, this paper proposes FB-Roach, an end-to-end reinforcement learning model that integrates bird's-eye view (BEV) prediction. A BEV prediction model establishes the representation of environmental information. A forward projection module centered on a static look-up table and a multi-task backward projection module that integrates temporal information, depth embedding, and semantic embedding are designed to ensure consistency between input signals and output actions. Furthermore, a non-recurrent deep network architecture incorporating an attention mechanism is proposed to fuse BEV and vehicle state information effectively. The model's action output is optimized with the Proximal Policy Optimization (PPO) reinforcement learning algorithm to achieve intelligent decision-making and control for autonomous vehicles. A variety of quantitative evaluation indicators are constructed under different benchmarks in the CARLA simulator. The experimental results show that the proposed algorithm outperforms current mainstream algorithms in model convergence speed and driving decision safety.

Key words: end-to-end autonomous vehicles, BEV, reinforcement learning, decision-making
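
Illustrative note: the abstract states that the action output is optimized with PPO but gives no training details. The sketch below shows only the standard PPO clipped surrogate objective, not the FB-Roach training code. The function name, the argument names, and the clipping value of 0.2 are assumptions chosen for illustration and do not come from the paper.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that collected the data.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the clipping caps how much a single update can gain from raising an action's probability. With a negative advantage, the unclipped term is kept, so a harmful increase in probability is penalized in full.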