汽车工程 ›› 2024, Vol. 46 ›› Issue (4): 596-604. doi: 10.19562/j.chinasae.qcgc.2024.04.005


基于逆模型预测控制的拟人驾驶控制

刘辉1,张发旺1,聂士达1,段京良2,郭丛帅1,郭凌雄1

  1. 北京理工大学机械与车辆学院,北京 100081
  2. 北京科技大学机械工程学院,北京 100083
  • 收稿日期:2023-09-04 修回日期:2023-10-30 出版日期:2024-04-25 发布日期:2024-04-24
  • 通讯作者: 聂士达 E-mail:nieshida@bit.edu.cn
  • 基金资助:
    国家自然科学基金(52002212);部级基金项目(2020-063)

Human-Like Driving Control Based on Inverse Model Predictive Control

Hui Liu1, Fawang Zhang1, Shida Nie1, Jingliang Duan2, Congshuai Guo1, Lingxiong Guo1

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081
  2. School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083
  • Received:2023-09-04 Revised:2023-10-30 Online:2024-04-25 Published:2024-04-24
  • Contact: Shida Nie E-mail:nieshida@bit.edu.cn

摘要:

本文提出一种基于逆模型预测控制的拟人驾驶控制方法,利用模型预测控制产生的实轴轨迹与真实驾驶轨迹构造损失函数,更新控制模块代价函数的权重系数,从而实现拟人化驾驶控制。该方法将拟人驾驶控制构建为一个双层优化问题:下层利用模型预测控制求解典型的最优控制问题,产生实轴驾驶轨迹;上层通过最小化所产生的实轴轨迹与真实驾驶轨迹之间的误差,更新下层代价函数的权重系数;并基于微分极大值原理构造辅助系统,求解实轴轨迹关于代价函数权重系数的梯度。通过实车采集真实驾驶轨迹,进行模仿测试与泛化验证。结果表明:相比于两类基于虚轴轨迹的逆最优控制方法,本文方法在3个工况下与真实驾驶轨迹的最大误差分别平均降低了73.52%和65.03%,驾驶行为更加拟人化,且具备泛化性能。

关键词: 自动驾驶, 拟人化驾驶, 逆最优控制

Abstract:

In this paper, a human-like driving control method based on inverse model predictive control is proposed, in which human-like driving is achieved by updating the weight coefficients of the control module's cost function with a loss function defined between the real-time trajectories generated by model predictive control and the recorded human driving trajectories. Human-like driving control is formulated as a bilevel optimization problem. In the lower layer, real-time driving trajectories are generated by solving a typical optimal control problem with model predictive control. In the upper layer, the weight coefficients of the lower-layer cost function are updated by minimizing the error between the generated real-time trajectories and the human driving trajectories. An auxiliary system based on the differential Pontryagin's maximum principle is constructed to compute the gradient of the real-time trajectory with respect to the weight coefficients of the cost function. Human driving trajectories are collected from a real vehicle for imitation tests and generalization validation. The results show that, compared with two types of inverse optimal control methods based on virtual-time trajectories, the proposed method reduces the maximum error with respect to the real driving trajectories by an average of 73.52% and 65.03%, respectively, over the three test conditions, produces more human-like driving behavior, and exhibits generalization capability.
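
As a reading aid, the bilevel structure described above can be sketched in standard notation as follows; the symbols (weights θ, closed-loop real-time trajectory ξ(θ), human driving trajectory ξ^h, vehicle dynamics f, stage cost ℓ_θ, prediction horizon N) are illustrative and are not taken from the paper.

  % Minimal sketch of the bilevel (inverse MPC) formulation; notation is illustrative only
  \begin{aligned}
  \min_{\theta}\;& \mathcal{L}(\theta)=\sum_{t=0}^{T}\bigl\lVert \xi_t(\theta)-\xi^{h}_t \bigr\rVert^{2}
    &&\text{(upper layer: match the human driving trajectory)}\\
  \text{s.t.}\;& u_t(\theta)=\arg\min_{u_{0:N-1}}\sum_{k=0}^{N-1}\ell_{\theta}(x_k,u_k),
    \quad x_{k+1}=f(x_k,u_k),\; x_0=\xi_t(\theta)
    &&\text{(lower layer: MPC with weights }\theta\text{)}\\
  &\xi_{t+1}(\theta)=f\bigl(\xi_t(\theta),u_t(\theta)\bigr)
    &&\text{(closed-loop real-time trajectory)}
  \end{aligned}

In the paper, the gradient of ξ(θ) with respect to θ required by the upper-layer update is obtained through the auxiliary system derived from the differential Pontryagin's maximum principle; the sketch above leaves that computation abstract.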

Key words: autonomous driving, human-like driving, inverse optimal control