Automotive Engineering (汽车工程) ›› 2024, Vol. 46 ›› Issue (5): 776-783. doi: 10.19562/j.chinasae.qcgc.2024.05.004

动态场景下基于3D多目标追踪的实时视觉SLAM方法研究

陈吉清1,2,车宇翔1,2,田小强1,2,兰凤崇1,2,周云郊1,2

  1. 华南理工大学机械与汽车工程学院,广州 510640
  2. 华南理工大学,广东省汽车工程重点实验室,广州 510640
  • 收稿日期:2023-11-12 修回日期:2023-12-26 出版日期:2024-05-25 发布日期:2024-05-17
  • 通讯作者: 周云郊 E-mail:mezhouyj@scut.edu.cn
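  • Funding:
    National Natural Science Foundation of China (52175267); Natural Science Foundation of Guangdong Province (2021A1515010912); National Automobile Accident In-Depth Investigation System (NAIS) and the New Energy Vehicle Accident Investigation Collaboration Network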

Research on Real-Time Visual SLAM Method Based on 3D Multi-Object Tracking in Dynamic Scenes

Jiqing Chen1,2, Yuxiang Che1,2, Xiaoqiang Tian1,2, Fengchong Lan1,2, Yunjiao Zhou1,2

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640
  2. South China University of Technology, Guangdong Provincial Key Laboratory of Automotive Engineering, Guangzhou 510640
  • Received: 2023-11-12 Revised: 2023-12-26 Online: 2024-05-25 Published: 2024-05-17
  • Contact: Yunjiao Zhou E-mail: mezhouyj@scut.edu.cn

摘要:

近年来一些解决动态场景下的SLAM技术被提出,其中SLAM与MOT结合的技术路线不仅可解决动态场景问题,还可以提高系统对周围场景的理解,获得了更大关注。本文介绍了一种高效的实时在线视觉SLAMMOT融合系统,以双目视觉或RGBD作为输入,只须借助2D目标检测网络,便能高效、准确、鲁棒地跟踪相机以及动态目标的位姿,并生成稀疏点云地图。为提高多动态目标追踪的精度与准确度,引入了级联匹配与IOU匹配结合的策略;利用阿克曼转向模型来简化追踪目标的运动,减少求解动态目标位姿所需匹配点的数量;利用因子图将相机与动态目标的追踪结果进行联合优化,同时提高相机、追踪目标的位姿和地图点的精度。最后在KITTI跟踪数据集上与其他方法进行比较。结果表明,在满足实时性要求的前提下,该方法仍能准确地追踪相机以及动态目标位姿。

关键词: 视觉SLAM, 动态场景, 多目标追踪, 实时系统

Abstract:

In recent years, several techniques have been proposed for Simultaneous Localization and Mapping (SLAM) in dynamic scenes. Among them, combining SLAM with multi-object tracking (MOT) has attracted particular attention, because it both handles dynamic scenes and improves the system's understanding of its surroundings. This paper introduces an efficient, real-time, online visual SLAM-MOT fusion system that takes stereo or RGB-D images as input. Relying only on a 2D object detection network, the system tracks the poses of the camera and of dynamic objects efficiently, accurately and robustly, while generating a sparse point cloud map. To improve the accuracy of tracking multiple dynamic objects, a strategy that combines cascade matching with IoU matching is introduced. The Ackermann steering model is used to simplify the motion of the tracked objects, which reduces the number of matched points needed to solve for a dynamic object's pose. A factor graph is used to jointly optimize the tracking results of the camera and the dynamic objects, which simultaneously improves the accuracy of the camera poses, the object poses, and the map points. Finally, the proposed method is compared with other approaches on the KITTI tracking dataset. The results show that the method tracks camera and dynamic object poses accurately while still meeting real-time requirements.

Key words: visual SLAM, dynamic scenes, multi-object tracking, real-time system
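
Illustrative note on the IoU matching mentioned in the abstract: the sketch below shows one simple way to associate existing tracks with new 2D detections by intersection over union. It is not the authors' implementation. The box format `(x1, y1, x2, y2)`, the greedy assignment, the function names `iou` and `match_by_iou`, and the `min_iou` threshold of 0.3 are all assumptions made for illustration, and the cascade matching stage that the paper combines with IoU matching is omitted.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 2D boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_by_iou(
    tracks: List[Box], detections: List[Box], min_iou: float = 0.3
) -> List[Tuple[int, int]]:
    """Greedily pair tracks with detections, highest IoU first.

    Returns (track_index, detection_index) pairs. Each track and each
    detection is used at most once; pairs below min_iou are rejected.
    min_iou is a hypothetical threshold, not a value from the paper.
    """
    candidates = sorted(
        (
            (iou(t, d), ti, di)
            for ti, t in enumerate(tracks)
            for di, d in enumerate(detections)
        ),
        reverse=True,
    )
    used_tracks, used_dets = set(), set()
    matches: List[Tuple[int, int]] = []
    for score, ti, di in candidates:
        if score < min_iou:
            break  # all remaining candidates overlap even less
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return matches


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
    print(sorted(match_by_iou(tracks, detections)))
```

Greedy assignment is used here only to keep the sketch dependency-free; an optimal assignment (for example, the Hungarian algorithm) is a common alternative. Tracks and detections left unmatched would typically be handled by the tracker's own lifecycle logic, which is outside the scope of this sketch.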