Automotive Engineering ›› 2025, Vol. 47 ›› Issue (11): 2070-2082. doi: 10.19562/j.chinasae.qcgc.2025.11.002


An Efficient Learning Method for Multi-Modal Task Path Planning of Flying Vehicles

Jing Zhao1,2, Chao Yang1,2, Weida Wang1,2, Ying Li1,2, Changle Xiang1,2

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
    2. Hefei Unmanned Intelligent Equipment Research Institute, Beijing Institute of Technology, Hefei 230041, China
  • Received: 2025-04-11 Revised: 2025-05-31 Online: 2025-11-25 Published: 2025-11-28
  • Contact: Chao Yang E-mail: cyang@bit.edu.cn
  • Supported by: National Natural Science Foundation of China (524B2155)


Abstract:

Flying vehicles have attracted significant attention in urban traffic, rescue transportation, and other operational fields, and efficient multi-modal task path planning can markedly improve their operational efficiency in these fields. Therefore, an efficient learning method for multi-modal task path planning of flying vehicles is proposed. Firstly, the action space of the flying vehicle is optimized: the take-off, landing, and target-oriented actions are retained, and a probability selection mechanism is designed for the non-target-direction actions. Secondly, considering the air-ground coordination characteristics of the flying vehicle, a novel Q-learning reward function is designed, together with a reward enhancement mechanism based on the experience of the historical optimal path. Finally, a path smoothing method is proposed to obtain a smooth and continuous path for the air-ground cooperative task. The results show that the multi-modal path planned by the proposed method is 10.35, 126.75, and 162.10 m shorter in running distance than the paths planned by A*, Q-learning, and D* Lite, respectively. In terms of learning efficiency, the proposed method reduces the learning time by 45.97% compared with Q-learning.
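The abstract describes the method only at a high level and the page carries no code, so the following Python fragment is a hypothetical sketch rather than the authors' implementation. It shows one plausible reading of the three steps: a pruned action space with a probability selection mechanism for non-target-direction actions, a tabular Q-learning update with a bonus reward for transitions on the historical optimal path, and a generic waypoint smoother. All function names, the grid action encoding, and the hyper-parameters (p_nontarget, bonus, weight, and so on) are assumptions introduced for illustration.

```python
import random
from collections import defaultdict

# Hypothetical 3-D grid actions for an air-ground vehicle: eight planar
# moves plus vertical take-off / landing transitions between modes.
PLANAR_ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
                  (1, 1, 0), (1, -1, 0), (-1, 1, 0), (-1, -1, 0)]
TAKE_OFF, LAND = (0, 0, 1), (0, 0, -1)

# Q-table over (state, action) keys; missing entries default to 0.0.
q_table = defaultdict(float)


def toward_target(state, goal):
    """Planar actions whose per-axis components do not oppose the goal direction."""
    dx = (goal[0] > state[0]) - (goal[0] < state[0])
    dy = (goal[1] > state[1]) - (goal[1] < state[1])
    return [a for a in PLANAR_ACTIONS if a[0] * dx >= 0 and a[1] * dy >= 0]


def select_action(q, state, goal, eps=0.1, p_nontarget=0.05):
    """Epsilon-greedy choice over a pruned action set.

    Take-off, landing, and target-direction moves are always candidates;
    each remaining move is only admitted with probability p_nontarget,
    standing in for the paper's probability selection mechanism
    (p_nontarget is a guessed value, not one from the paper).
    """
    candidates = toward_target(state, goal) + [TAKE_OFF, LAND]
    candidates += [a for a in PLANAR_ACTIONS
                   if a not in candidates and random.random() < p_nontarget]
    if random.random() < eps:
        return random.choice(candidates)
    return max(candidates, key=lambda a: q[(state, a)])


def update(q, state, action, reward, next_state, goal,
           best_path=frozenset(), bonus=5.0, alpha=0.1, gamma=0.95):
    """Tabular Q-learning update; transitions on the best path found so
    far receive an extra shaping bonus (the 'historical optimal path'
    reward enhancement, with bonus as an assumed magnitude)."""
    if (state, action) in best_path:
        reward += bonus
    next_candidates = toward_target(next_state, goal) + [TAKE_OFF, LAND]
    target = reward + gamma * max(q[(next_state, a)] for a in next_candidates)
    q[(state, action)] += alpha * (target - q[(state, action)])


def smooth(path, weight=0.25, iters=50):
    """Generic iterative waypoint smoothing (a stand-in for the paper's
    path smoothing step): each interior point is repeatedly pulled
    toward the average of its two neighbours, endpoints held fixed."""
    pts = [list(p) for p in path]
    for _ in range(iters):
        for i in range(1, len(pts) - 1):
            for k in range(3):
                pts[i][k] += weight * (pts[i - 1][k] + pts[i + 1][k] - 2 * pts[i][k])
    return [tuple(p) for p in pts]
```

Under these assumptions, an episode loop would alternate select_action and update on q_table, refresh best_path (as a set of (state, action) pairs) whenever a shorter goal-reaching episode is found, and finally pass the resulting waypoint list through smooth to obtain a continuous air-ground trajectory.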

Key words: flying vehicles, multi-modal task path planning, action space, reward function, path smoothing