Automotive Engineering ›› 2023, Vol. 45 ›› Issue (9): 1499-1515.doi: 10.19562/j.chinasae.qcgc.2023.ep.006
Special Issue: Intelligent and Connected Vehicle Technology: Planning & Decision-Making, 2023
Shengbo Eben Li(),Guojian Zhan,Yuxuan Jiang,Zhiqian Lan,Yuhang Zhang,Wenjun Zou,Chen Chen,Bo Cheng,Keqiang Li
Received: 2023-02-13
Revised: 2023-03-16
Online: 2023-09-25
Published: 2023-09-23
Contact: Shengbo Eben Li
E-mail: lishbo@tsinghua.edu.cn
Shengbo Eben Li,Guojian Zhan,Yuxuan Jiang,Zhiqian Lan,Yuhang Zhang,Wenjun Zou,Chen Chen,Bo Cheng,Keqiang Li. Key Technologies of Brain-Inspired Decision and Control Intelligence for Autonomous Driving Systems[J].Automotive Engineering, 2023, 45(9): 1499-1515.
Reference | Architecture | Driving task | Simulator | State representation | Training algorithm | Real-vehicle test
---|---|---|---|---|---|---
Lillicrap et al. | E2E | Closed racing track | TORCS | Feature-based | DDPG | |
Chen et al. | E2E | Two-lane roundabout | CARLA | Feature-based | SAC, TD3 | |
Li et al. | E2E | Signalized intersection | MetaDrive | Object-based | PPO, SAC | |
Duan et al. | E2E | Multi-lane road | LasVSim | Combined | DSAC | Yes |
Hoel et al. | HDC | Lane-change decision | | Object-based | DQN | |
Yurtsever et al. | HDC | Motion control | CARLA | Feature-based | DQN | |
Liu et al. | HDC | Motion control | | Object-based | RMPC | |
Guan et al. | IDC | Intersection | LasVSim | Object-based | ADP | Yes |
Gu et al. | IDC | Highway multi-lane | | Combined | SAC | |
Ren et al. | IDC | Signalized intersection | LasVSim | Combined | ADP | |
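Several entries in the table above train a value-based policy (e.g. DQN) by repeated temporal-difference updates. As a minimal, hypothetical sketch of that training loop, the tabular Q-learning fragment below stands in for the deep variants; the 1-D "lane keeping" task (states are lateral offsets, actions steer left/stay/right, reward peaks at the lane centre) is invented for illustration and is not from any of the cited works.

```python
import numpy as np

def train_q(episodes=500, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D lane-keeping task (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 5, 3          # lateral offsets -2..2; actions -1, 0, +1
    q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = int(rng.integers(n_states))  # random initial lateral offset
        for _ in range(20):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(q[s]))
            s2 = int(np.clip(s + (a - 1), 0, n_states - 1))
            r = -abs(s2 - 2)             # reward is highest at the lane centre
            # temporal-difference update toward the bootstrapped target
            q[s, a] += alpha * (r + gamma * np.max(q[s2]) - q[s, a])
            s = s2
    return q

q = train_q()
# The greedy policy should steer toward the lane centre from either edge:
# action index 2 (steer right) from the leftmost state, 0 from the rightmost.
print(int(np.argmax(q[0])), int(np.argmax(q[4])))
```

The deep methods in the table replace the Q-table with a neural network and add replay buffers and target networks, but the update rule has the same structure.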
1 | 李升波, 关阳, 侯廉, 等. 深度神经网络的关键技术及其在自动驾驶领域的应用[J]. 汽车安全与节能学报, 2019, 10(2): 119-145. |
LI S E, GUAN Y, HOU L, et al. Key technique of deep neural network and its applications in autonomous driving[J]. Journal of Automotive Safety and Energy, 2019, 10(2): 119-145. | |
2 | HANCOCK P A, NOURBAKHSH I, STEWART J. On the future of transportation in an era of automated and autonomous vehicles[J]. Proceedings of the National Academy of Sciences, 2019, 116(16): 7684-7691. |
3 | KALRA N, PADDOCK S M. Driving to safety: how many miles of driving would it take to demonstrate autonomous vehicle reliability[J]. Transportation Research Part A: Policy and Practice, 2016, 94: 182-193. |
4 | 李克强, 戴一凡, 李升波, 等. 智能网联汽车 (ICV) 技术的发展现状及趋势[J]. 汽车安全与节能学报, 2017, 8(1): 1-14. |
LI K Q, DAI Y F, LI S E, et al. State-of-the-art and technical trends of intelligent and connected vehicles[J]. Journal of Automotive Safety and Energy, 2017, 8(1): 1-14. | |
5 | 丁飞, 张楠, 李升波, 等. 智能网联车路云协同系统架构与关键技术研究综述[J]. 自动化学报, 2022, 48: 1-24. |
DING F, ZHANG N, LI S E, et al. A survey of architecture and key technologies of intelligent connected vehicle-road-cloud cooperation system[J]. Acta Automatica Sinica, 2022, 48: 1-24. | |
6 | URMSON C, BAKER C, DOLAN J, et al. Autonomous driving in traffic: boss and the urban challenge[J]. AI Magazine, 2009, 30(2): 17-28. |
7 | MONTEMERLO M, BECKER J, BHAT S, et al. Junior: the stanford entry in the urban challenge[J]. Journal of Field Robotics, 2008, 25(9): 569-597. |
8 | BOJARSKI M, DEL TESTA D, DWORAKOWSKI D, et al. End to end learning for self-driving cars[J]. arXiv preprint, 2016. |
9 | VALLON C, ERCAN Z, CARVALHO A, et al. A machine learning approach for personalized autonomous lane change initiation and control[C]. Intelligent Vehicles Symposium (IV). IEEE, 2017: 1590-1595. |
10 | RESCORLA R A. Behavioral studies of Pavlovian conditioning[J]. Annual Review of Neuroscience, 1988, 11(1): 329-352. |
11 | THORNDIKE E L. Animal intelligence: experimental studies[M]. Transaction Publishers, 1911. |
12 | SCHULTZ W, DAYAN P, MONTAGUE P R. A neural substrate of prediction and reward[J]. Science, 1997, 275(5306): 1593-1599. |
13 | LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[C]. International Conference on Learning Representations (ICLR). 2016. |
14 | GUAN Y, REN Y, SUN Q, et al. Integrated decision and control: toward interpretable and efficient driving intelligence[J]. IEEE Transactions on Cybernetics, 2022, 53(2): 859-873. |
15 | GUAN Y, TANG L, LI C, et al. Integrated decision and control for high-level automated vehicles by mixed policy gradient and its experiment verification[J]. arXiv preprint, 2022. |
16 | JIANG J, REN Y, GUAN Y, et al. Integrated decision and control at multi-lane intersections with mixed traffic flow[J]. Journal of Physics: Conference Series, 2022, 2234(1): 012015. |
17 | CAI P, SUN Y, CHEN Y, et al. Vision-based trajectory planning via imitation learning for autonomous vehicles[C]. International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2019: 2736-2742. |
18 | HOEL C J, WOLFF K, LAINE L. Automated speed and lane change decision making using deep reinforcement learning[C]. International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2018: 2148-2155. |
19 | YURTSEVER E, CAPITO L, REDMILL K, et al. Integrating deep reinforcement learning with model-based path planners for automated driving[C]. Intelligent Vehicles Symposium (IV). IEEE, 2020: 1311-1316. |
20 | DUAN J, LI S E, GUAN Y, et al. Hierarchical reinforcement learning for self‐driving decision‐making without reliance on labelled driving data[J]. IET Intelligent Transport Systems, 2020, 14(5): 297-305. |
21 | LIU Z, DUAN J, WANG W, et al. Recurrent model predictive control: learning an explicit recurrent controller for nonlinear systems[J]. IEEE Transactions on Industrial Electronics, 2022, 69(10): 10437-10446. |
22 | LIN Z, DUAN J, LI S E, et al. Policy-iteration-based finite-horizon approximate dynamic programming for continuous-time nonlinear optimal control[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022. |
23 | REN Y, JIANG J, ZHAN G, et al. Self-learned intelligence for integrated decision and control of automated vehicles at signalized intersections [J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(12): 24145-24156. |
24 | GU Z, YIN Y, LI S E, et al. Integrated eco-driving automation of intelligent vehicles in multi-lane scenario via model-accelerated reinforcement learning [J]. Transportation Research Part C: Emerging Technologies, 2022, 144: 103863. |
25 | GUAN Y, REN Y, MA H, et al. Learn collision-free self-driving skills at urban intersections with model-based reinforcement learning[C]. International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2021: 3462-3469. |
26 | CHEN D, ZHOU B, KOLTUN V, et al. Learning by cheating[C]. Conference on Robot Learning (CoRL). 2020: 66-75. |
27 | CHEN J, YUAN B, TOMIZUKA M. Model-free deep reinforcement learning for urban autonomous driving[C]. International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2019: 2765-2771. |
28 | PENG B, SUN Q, LI S E, et al. End-to-End autonomous driving through dueling double deep Q-network [J]. Automotive Innovation, 2021, 4(3): 328-337. |
29 | LI Q, PENG Z, FENG L, et al. Metadrive: composing diverse driving scenarios for generalizable reinforcement learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022. |
30 | DUAN J, ZHANG F, LI S E, et al. Applications of distributional soft actor-critic in real-world autonomous driving[C]. International Conference on Computer, Control and Robotics (ICCCR). IEEE, 2022: 109-114. |
31 | CHEN J, LI S E, TOMIZUKA M. Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(6): 5068-5078. |
32 | LESORT T, DÍAZ-RODRÍGUEZ N, GOUDOU J F, et al. State representation learning for control: an overview[J]. Neural Networks, 2018, 108: 379-392. |
33 | DE BRUIN T, KOBER J, TUYLS K, et al. Integrating state representation learning into deep reinforcement learning[J]. IEEE Robotics and Automation Letters, 2018, 3(3): 1394-1401. |
34 | DUAN J, YU D, LI S E, et al. Fixed-dimensional and permutation invariant state representation of autonomous driving[J]. IEEE Transactions on Intelligent Transportation Systems, 2021, 23(7): 9518-9528. |
35 | ISELE D, RAHIMI R, COSGUN A, et al. Navigating occluded intersections with autonomous vehicles using deep reinforcement learning[C]. International Conference on Robotics and Automation (ICRA). IEEE, 2018: 2034-2039. |
36 | GE Q, SUN Q, LI S E, et al. Numerically stable dynamic bicycle model for discrete-time control[C]. Intelligent Vehicles Symposium. IEEE, 2021: 128-134. |
37 | LI G, YANG Y, LI S E, et al. Decision making of autonomous vehicles in lane change scenarios: deep reinforcement learning approaches with risk awareness[J]. Transportation Research Part C: Emerging Technologies, 2022, 134: 103452. |
38 | REN Y, DUAN J, LI S E, et al. Improving generalization of reinforcement learning with minimax distributional soft actor-critic[C]. International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020: 1-6. |
39 | LIN Z, DUAN J, LI S E, et al. Continuous-time finite-horizon ADP for automated vehicle controller design with high efficiency[C]. International Conference on Unmanned Systems (ICUS). IEEE, 2020: 978-984. |
40 | XIN L, KONG Y, LI S E, et al. Enable faster and smoother spatio-temporal trajectory planning for autonomous vehicles in constrained dynamic environment[J]. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering, 2021, 235(4): 1101-1112. |
41 | YU D, MA H, LI S E, et al. Reachability constrained reinforcement learning[C]. International Conference on Machine Learning (ICML). PMLR, 2022: 25636-25655. |
42 | LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444. |
43 | QI C R, SU H, MO K, et al. Pointnet: deep learning on point sets for 3D classification and segmentation[C]. Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 652-660. |
44 | YU Y, SI X, HU C, et al. A review of recurrent neural networks: LSTM cells and network architectures[J]. Neural Computation, 2019, 31(7): 1235-1270. |
45 | CRESWELL A, WHITE T, DUMOULIN V, et al. Generative adversarial networks: an overview[J]. IEEE Signal Processing Magazine, 2018, 35(1): 53-65. |
46 | WANG Y, YAO H, ZHAO S. Auto-encoder based dimensionality reduction[J]. Neurocomputing, 2016, 184: 232-242. |
47 | KINGMA D P, WELLING M. An introduction to variational autoencoders[J]. Foundations and Trends in Machine Learning, 2019, 12(4): 307-392. |
48 | MU Y M, CHEN S, DING M, et al. CtrlFormer: learning transferable state representation for visual control via transformer[C]. International Conference on Machine Learning (ICML), 2022: 16043-16061. |
49 | ZAHEER M, KOTTUR S, RAVANBAKHSH S, et al. Deep sets[C]. Advances in Neural Information Processing Systems (NIPS), 2017, 30. |
50 | MARON H, LITANY O, CHECHIK G, et al. On learning sets of symmetric elements[C]. International Conference on Machine Learning (ICML), 2020: 6734-6744. |
51 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]. Advances in Neural Information Processing Systems (NIPS), 2017, 30. |
52 | FENG S, SUN H, YAN X, et al. Dense reinforcement learning for safety validation of autonomous vehicles[J]. Nature, 2023,615(7953): 620-627. |
53 | SOBHANI A, YOUNG W, BAHROLOLOOM S, et al. Calculating time-to-collision for analysing right turning behaviour at signalised intersections[J]. Road & Transport Research: A Journal of Australian and New Zealand Research and Practice, 2013, 22(3): 49-61. |
54 | KOLEKAR S, DE WINTER J, ABBINK D. Human-like driving behaviour emerges from a risk-based driver model[J]. Nature Communications, 2020, 11(1): 1-13. |
55 | CHEN C, LAN Z, ZHAN G, et al. Podar: modeling driver's perceived risk with situation awareness theory[J]. Available at SSRN 4129030. |
56 | LI S E, LI K, WANG J. Economy-oriented vehicle adaptive cruise control with coordinating multiple objectives function[J]. Vehicle System Dynamics, 2013, 51(1): 1-17. |
57 | LI S E. Reinforcement learning for sequential decision and optimal control[M]. Springer, 2023. |
58 | MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533. |
59 | FUJIMOTO S, HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]. International Conference on Machine Learning (ICML), 2018: 1587-1596. |
60 | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv preprint, 2017. |
61 | HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]. International Conference on Machine Learning (ICML), 2018: 1861-1870. |
62 | DUAN J, GUAN Y, LI S E, et al. Distributional soft actor-critic: off-policy reinforcement learning for addressing value estimation errors[J]. IEEE Transactions on Neural Networks And Learning Systems, 2021, 33(11): 6584-6598. |
63 | GUAN Y, DUAN J, LI S E, et al. Mixed policy gradient[J]. arXiv preprint, 2021. |
64 | MU Y, PENG B, GU Z, et al. Mixed reinforcement learning for efficient policy optimization in stochastic environments[C]. International Conference on Control, Automation and Systems (ICCAS). IEEE, 2020: 1212-1219. |
65 | GUAN Y, LI S E, DUAN J, et al. Direct and indirect reinforcement learning[J]. International Journal of Intelligent Systems, 2021, 36(8): 4439-4467. |