汽车工程 (Automotive Engineering) ›› 2024, Vol. 46 ›› Issue (6): 945-955. doi: 10.19562/j.chinasae.qcgc.2024.06.001
Vehicle Trajectory Tracking and Collision Avoidance Control Based on Multi-style Reinforcement Learning
Liming Xiao1, Fawang Zhang2, Liangfa Chen1, Haoqi Yan1, Fei Ma1, Shengbo Eben Li3, Jingliang Duan1
Received:
2023-12-13
Revised:
2024-01-12
Online:
2024-06-25
Published:
2024-06-19
Contact:
Jingliang Duan
E-mail:duanjl@ustb.edu.cn
Abstract:
Trajectory tracking with collision avoidance is a key embodiment of vehicle intelligence. To address the limitation that existing control methods offer only a single control style in a given scenario, this paper proposes a multi-style reinforcement learning control method. To achieve diversity of control style, a style indicator is introduced into the value network and the policy network for the first time, and a multi-style tracking and collision-avoidance policy network is built. Combining this with distributional reinforcement learning theory, a multi-style policy iteration framework is constructed, from which a multi-style distributional reinforcement learning algorithm is derived. Simulation and real-vehicle experiments show that the proposed method can complete the trajectory tracking and collision avoidance task in multiple driving styles (aggressive, neutral, and conservative); the steady-state tracking error on the real vehicle is below 5 cm, indicating high control accuracy, and the average per-step decision time on the real vehicle is only 6.07 ms, satisfying real-time requirements.
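To make the style-conditioned architecture described in the abstract more concrete, the sketch below is a minimal, hypothetical PyTorch illustration, not the paper's implementation: a policy network and a quantile-based distributional value network both take a style indicator as an extra input, so the same state can be mapped to different actions and return distributions under different styles. The network sizes, the scalar style convention, and all identifiers are assumptions made for illustration only.

```python
# Hypothetical sketch (not the paper's released code) of a style-conditioned
# actor-critic: the style indicator is fed to both the policy network and a
# quantile-based distributional value network, so one set of weights can
# express aggressive, neutral, or conservative behaviour.
import torch
import torch.nn as nn


class MultiStylePolicy(nn.Module):
    """Policy network: maps (state, style indicator) to a continuous action."""

    def __init__(self, state_dim=6, style_dim=1, act_dim=2, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + style_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state, style):
        return self.net(torch.cat([state, style], dim=-1))


class MultiStyleQuantileCritic(nn.Module):
    """Distributional value network: returns quantiles of the return for a
    given (state, action, style indicator) triple."""

    def __init__(self, state_dim=6, act_dim=2, style_dim=1,
                 n_quantiles=32, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + act_dim + style_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_quantiles),
        )

    def forward(self, state, action, style):
        return self.net(torch.cat([state, action, style], dim=-1))


if __name__ == "__main__":
    policy = MultiStylePolicy()
    critic = MultiStyleQuantileCritic()
    state = torch.randn(4, 6)        # a batch of tracking/avoidance states
    style = torch.rand(4, 1)         # assumed convention: 0 = conservative, 1 = aggressive
    action = policy(state, style)    # same state, different styles -> different actions
    quantiles = critic(state, action, style)
    print(action.shape, quantiles.shape)  # torch.Size([4, 2]) torch.Size([4, 32])
```

In a training loop built on the paper's multi-style policy iteration idea, the style value would presumably be sampled alongside the state so that a single set of weights spans the whole style spectrum; the precise objectives and convergence argument are those derived in the paper itself.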
Liming Xiao, Fawang Zhang, Liangfa Chen, Haoqi Yan, Fei Ma, Shengbo Eben Li, Jingliang Duan. Vehicle Trajectory Tracking and Collision Avoidance Control Based on Multi-style Reinforcement Learning[J]. Automotive Engineering, 2024, 46(6): 945-955.
1. LI S E, GUAN Y, HOU L, et al. Key technology of deep neural network and its application in the field of autonomous driving[J]. Journal of Automotive Safety and Energy Conservation, 2019, 10(2): 119-145.
2. ZHANG P, ZHU B, ZHAO J, et al. Performance evaluation method for automated driving system in logical scenario[J]. Automotive Innovation, 2022, 5(3): 299-310.
3. LI D F, CHA A F, XU B, et al. Trajectory tracking control algorithm for emergency collision avoidance of semi-trailer automobile train[J]. Automotive Engineering, 2022, 44(7): 1098-1106.
4. LI S E, ZHAN G J, JIANG Y X, et al. Key technologies of brain-inspired decision and control intelligence for autonomous driving systems[J]. Automotive Engineering, 2023, 45(9): 1499-1515.
5. GUAN Y, TANG L, LI C, et al. Integrated decision and control for high-level automated vehicles by mixed policy gradient and its experiment verification[J]. arXiv preprint, 2022.
6. LI G, ZHOU W, LIN S, et al. On-ramp merging for highway autonomous driving: an application of a new safety indicator in deep reinforcement learning[J]. Automotive Innovation, 2023, 6(3): 453-465.
7. WANG J, XU S Z, GAN H, et al. Key technologies and challenges of intelligent vehicle in-depth defense[C]. 2018 SAE-China Annual Conference Proceedings, 2018: 287-291.
8. LIU Z, ZHANG W, ZHAO F. Impact, challenges and prospect of software-defined vehicles[J]. Automotive Innovation, 2022, 5(2): 180-194.
9. WANG Y, CAO X, HU Y. A trajectory planning method of automatic lane change based on dynamic safety domain[J]. Automotive Innovation, 2023, 6(3): 466-480.
10. LIANG Y, LI Y, YU Y, et al. Path-following control of autonomous vehicles considering coupling effects and multi-source system uncertainties[J]. Automotive Innovation, 2021, 4(3): 284-300.
11. GUO N, ZHANG X, ZOU Y. Real-time predictive control of path following to stabilize autonomous electric vehicles under extreme drive conditions[J]. Automotive Innovation, 2022, 5(4): 453-470.
12. GE Q, SARTORETTI G, DUAN J, et al. Distributed model predictive control of connected multi-vehicle systems at unsignalized intersections[C]. 2022 IEEE International Conference on Unmanned Systems (ICUS). IEEE, 2022: 1466-1472.
13. WANG H W, LIU C Y, LI L, et al. Research on unmanned vehicle trajectory tracking control based on efficient NMPC algorithm[J]. Automotive Engineering, 2022, 44(10): 1494-1502.
14. LI G, ZHANG X, GUO H, et al. Real-time optimal trajectory planning for autonomous driving with collision avoidance using convex optimization[J]. Automotive Innovation, 2023: 1-11.
15. LIU Z, DUAN J, WANG W, et al. Recurrent model predictive control: learning an explicit recurrent controller for nonlinear systems[J]. IEEE Transactions on Industrial Electronics, 2022, 69(10): 10437-10446.
16. DUAN J, LI J, GE Q, et al. Relaxed actor-critic with convergence guarantees for continuous-time optimal control of nonlinear systems[J]. IEEE Transactions on Intelligent Vehicles, 2023, 8(5): 3299-3311.
17. YIN Y, LI S E, TANG K, et al. Approximate optimal filter design for vehicle system through actor-critic reinforcement learning[J]. Automotive Innovation, 2022, 5(4): 415-426.
18. HE X, LV C. Towards safe autonomous driving: decision making with observation-robust reinforcement learning[J]. Automotive Innovation, 2023: 1-12.
19. LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[C]. The 4th International Conference on Learning Representations (ICLR). San Juan, Puerto Rico: ICLR, 2016.
20. HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]. Proceedings of the 35th International Conference on Machine Learning (ICML). Stockholmsmässan, Sweden: PMLR, 2018: 1861-1870.
21. DUAN J, GUAN Y, LI S E, et al. Distributional soft actor-critic: off-policy reinforcement learning for addressing value estimation errors[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(11): 6584-6598.
22. PENG B, DUAN J, CHEN J, et al. Model-based chance-constrained reinforcement learning via separated proportional-integral Lagrangian[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022.
23. KENDALL A, HAWKE J, JANZ D, et al. Learning to drive in a day[C]. 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019: 8248-8254.
24. QUANTE L, ZHANG M, PREUK K, et al. Human performance in critical scenarios as a benchmark for highly automated vehicles[J]. Automotive Innovation, 2021, 4(3): 274-283.
25. SUN D M, ZHOU P D, SONG X S, et al. Optimization strategy of vehicle control introducing driving style coefficient[J]. Automotive & New Power, 2023, 6(3): 7-11.
26. WANG X Y, WEI X, XIE D, et al. Adaptive human-machine cooperative collision avoidance strategy based on weight penalty method[J]. Science Technology and Engineering, 2022, 22(13): 5463-5471.
27. LU H, LU C, YU Y, et al. Autonomous overtaking for intelligent vehicles considering social preference based on hierarchical reinforcement learning[J]. Automotive Innovation, 2022, 5(2): 195-208.
28. GAO B, CAI K, QU T, et al. Personalized adaptive cruise control based on online driving style recognition technology and model predictive control[J]. IEEE Transactions on Vehicular Technology, 2020, 69(11): 12482-12496.
29. LI Z, WU C, TAO P, et al. DP and DS-LCD: a new lane change decision model coupling driver's psychology and driving style[J]. IEEE Access, 2020, 8: 132614-132624.
30. REN Y, DUAN J, LI S E, et al. Improving generalization of reinforcement learning with minimax distributional soft actor-critic[C]. 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020: 1-6.
31. WURMAN P R, BARRETT S, KAWAMOTO K, et al. Outracing champion Gran Turismo drivers with deep reinforcement learning[J]. Nature, 2022, 602(7896): 223-228.
32. DABNEY W, ROWLAND M, BELLEMARE M, et al. Distributional reinforcement learning with quantile regression[C]. Proceedings of the AAAI Conference on Artificial Intelligence, 2018, 32(1).
33. YANG Q, SIMÃO T D, TINDEMANS S H, et al. Safety-constrained reinforcement learning with a distributional safety critic[J]. Machine Learning, 2023, 112(3): 859-887.
34. LI S E. Reinforcement learning for sequential decision and optimal control[M]. Springer, 2023.
35. WANG W, ZHANG Y, GAO J, et al. GOPS: a general optimal control problem solver for autonomous driving and industrial control applications[J]. Communications in Transportation Research, 2023, 3: 100096.
36. FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]. Proceedings of the 35th International Conference on Machine Learning (ICML). Stockholmsmässan, Sweden: PMLR, 2018: 1587-1596.