Administrated by China Association for Science and Technology
Sponsored by China Society of Automotive Engineers
Published by AUTO FAN Magazine Co. Ltd.

Automotive Engineering ›› 2024, Vol. 46 ›› Issue (5): 776-783. doi: 10.19562/j.chinasae.qcgc.2024.05.004

Research on Real-Time Visual SLAM Method Based on 3D Multi-Object Tracking in Dynamic Scenes

Jiqing Chen1,2, Yuxiang Che1,2, Xiaoqiang Tian1,2, Fengchong Lan1,2, Yunjiao Zhou1,2

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640
  2. South China University of Technology, Guangdong Provincial Key Laboratory of Automotive Engineering, Guangzhou 510640
  • Received:2023-11-12 Revised:2023-12-26 Online:2024-05-25 Published:2024-05-17
  • Contact: Yunjiao Zhou E-mail:mezhouyj@scut.edu.cn

Abstract:

In recent years, several techniques have emerged to address the challenges of Simultaneous Localization and Mapping (SLAM) in dynamic scenes. Among them, the integration of SLAM and moving object tracking (MOT) has gained significant attention, as it not only handles dynamic scenes but also improves the system's understanding of the surrounding environment. This paper introduces an efficient, real-time, online visual SLAM-MOT fusion system that takes stereo (binocular) or RGB-D images as input. With the help of a 2D object detection network, the approach tracks the camera and dynamic object poses efficiently, accurately, and robustly while generating a sparse point cloud map. To improve the precision and accuracy of multi-object tracking, a data association strategy that combines cascaded matching with IoU matching is introduced. The Ackermann steering model is used to simplify the motion estimation of tracked objects, which reduces the number of matching points required to solve for the pose of a dynamic target. Using a factor graph, the tracking results of the camera and the dynamic objects are jointly optimized, which simultaneously improves the accuracy of the camera poses, object poses, and map points. Finally, the proposed method is compared with other approaches on the KITTI tracking dataset. The results show that the method achieves accurate camera and dynamic object pose tracking while satisfying real-time requirements.
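As an illustration of the IoU matching step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes axis-aligned 2D boxes given as (x1, y1, x2, y2) and uses a simple greedy assignment; the function names `iou` and `greedy_iou_match` and the threshold of 0.3 are hypothetical choices made for this example. The cascaded matching stage and any optimal assignment solver the paper may use are not shown.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_match(tracks, detections, threshold=0.3):
    """Pair track boxes with detection boxes in descending IoU order.

    Returns a list of (track_index, detection_index) pairs. The threshold
    is an assumed value for illustration, not one taken from the paper.
    """
    candidates = sorted(
        (
            (iou(t, d), ti, di)
            for ti, t in enumerate(tracks)
            for di, d in enumerate(detections)
        ),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, ti, di in candidates:
        if score < threshold:
            break  # all remaining candidates overlap too little
        if ti in used_tracks or di in used_dets:
            continue  # each track and detection is matched at most once
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches
```

In a tracker of this kind, the boxes in `tracks` would typically be the predicted boxes of tracks left unmatched by an earlier stage, so that IoU matching acts as a fallback association.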

Key words: visual SLAM, dynamic environment, multiple object tracking, real-time system