Research on Real-Time Visual SLAM Method Based on 3D Multi-Object Tracking in Dynamic Scenes

Automotive Engineering, 2024, Vol. 46, Issue (5): 776-783. doi: 10.19562/j.chinasae.qcgc.2024.05.004

Jiqing Chen^1,2, Yuxiang Che^1,2, Xiaoqiang Tian^1,2, Fengchong Lan^1,2, Yunjiao Zhou^1,2

^1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640
^2. South China University of Technology, Guangdong Provincial Key Laboratory of Automotive Engineering, Guangzhou 510640

Received: 2023-11-12 Revised: 2023-12-26 Published: 2024-05-25 Online: 2024-05-17
Contact: Yunjiao Zhou E-mail: mezhouyj@scut.edu.cn
Funding: National Natural Science Foundation of China (52175267); Natural Science Foundation of Guangdong Province (2021A1515010912); supported by the National Automobile Accident In-Depth Investigation System (NAIS) and the New Energy Vehicle Accident Investigation Collaboration Network

Abstract:

In recent years, several techniques have been proposed to address simultaneous localization and mapping (SLAM) in dynamic scenes. Among them, combining SLAM with multi-object tracking (MOT) has attracted particular attention, because it not only handles dynamic scenes but also improves the system's understanding of the surrounding environment. This paper introduces an efficient real-time online visual SLAM-MOT fusion system that takes stereo or RGB-D images as input. Relying only on a 2D object detection network, it tracks the poses of the camera and of dynamic objects efficiently, accurately, and robustly, while generating a sparse point cloud map. To improve the precision and accuracy of multi-object tracking, a strategy combining cascade matching with IoU matching is introduced. The Ackermann steering model is used to simplify the motion of tracked objects, reducing the number of matched points required to solve for a dynamic object's pose. A factor graph jointly optimizes the tracking results of the camera and the dynamic objects, improving the accuracy of the camera poses, the object poses, and the map points at the same time. Finally, the proposed method is compared with other approaches on the KITTI tracking dataset. The results show that the method tracks camera and dynamic object poses accurately while meeting real-time requirements.

Keywords: visual SLAM, dynamic scenes, multi-object tracking, real-time system
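The abstract names IoU matching as one half of the data-association strategy. The following is a minimal sketch of that stage only, assuming 2D axis-aligned boxes in `(x1, y1, x2, y2)` form. The function names, the greedy assignment, and the `min_iou` threshold are illustrative assumptions and are not taken from the paper; the cascade stage, and whatever assignment solver the authors use, are not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_match(tracks, detections, min_iou=0.3):
    """Pair track boxes with detection boxes in order of descending IoU.

    min_iou is a hypothetical gate; pairs below it are left unmatched.
    Returns (matches, unmatched_track_indices, unmatched_detection_indices).
    """
    candidates = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, ti, di in candidates:
        if score < min_iou:
            break  # remaining candidates overlap even less
        if ti in used_tracks or di in used_dets:
            continue  # each track and detection is used at most once
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    unmatched_tracks = [i for i in range(len(tracks)) if i not in used_tracks]
    unmatched_dets = [i for i in range(len(detections)) if i not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
matches, lost_tracks, new_detections = greedy_iou_match(tracks, detections)
```

In a tracker of this kind, unmatched detections would typically seed new tracks and unmatched tracks would age out after several missed frames; how the paper handles those cases is not stated on this page.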

Jiqing Chen, Yuxiang Che, Xiaoqiang Tian, Fengchong Lan, Yunjiao Zhou. Research on Real-Time Visual SLAM Method Based on 3D Multi-Object Tracking in Dynamic Scenes[J]. Automotive Engineering, 2024, 46(5): 776-783.

Figures and tables (11): Figures 1-9, Tables 1-2

References (19)

1	XU Z, RONG Z, WU Y. A survey: which features are required for dynamic visual simultaneous localization and mapping?[J]. Vis Comput Ind Biomed Art, 2021, 4(1): 20.
2	BESCOS B, CAMPOS C, TARDÓS J D, et al. DynaSLAM II: tightly-coupled multi-object tracking and SLAM[J]. IEEE Robotics and Automation Letters, 2021, 6(3): 5191-5198.
3	GONZALEZ M, MARCHAND E, KACET A, et al. TwistSLAM: constrained SLAM in dynamic environment[J]. IEEE Robotics and Automation Letters, 2022, 7(3): 6846-6853.
4	BESCOS B, FÁCIL J M, CIVERA J, et al. DynaSLAM: tracking, mapping, and inpainting in dynamic scenes[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 4076-4083.
5	YU C, LIU Z, LIU X J, et al. DS-SLAM: a semantic visual SLAM towards dynamic environments[C]. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018: 1168-1174.
6	RÜNZ M, AGAPITO L. Co-fusion: real-time segmentation, tracking and fusion of multiple objects[C]. IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2017: 4471-4478.
7	ZHANG J, HENEIN M, MAHONY R, et al. VDO-SLAM: a visual dynamic object-aware SLAM system[J]. 2020. DOI: 10.48550/arXiv.2005.11052.
8	YANG S, SCHERER S. CubeSLAM: monocular 3-D object SLAM[J]. IEEE Transactions on Robotics, 2019, 35(4): 925-938.
9	YE T, ZHAO G. RT-SLAM: real-time visual dynamic object tracking SLAM[C]. IEEE 6th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC). IEEE, 2023: 677-682.
10	HENEIN M, KENNEDY G, MAHONY R, et al. Exploiting rigid body motion for SLAM in dynamic environments[C]. IEEE ICRA. IEEE, 2018.
11	CHEN Jianhua. Research on positional technology of stereo visual odometry for autonomous land vehicles[D]. Changchun: Jilin University, 2019.
12	MUR-ARTAL R, TARDÓS J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5): 1255-1262.
13	CAI Zhaowei, FAN Quanfu, FERIS R, et al. A unified multi-scale deep convolutional neural network for fast object detection[C]. Computer Vision - ECCV, 2016, 9908.
14	GERLACH N L, MEIJER G J, KROON D J, et al. Evaluation of the potential of automatic segmentation of the mandibular canal[J]. Br J Oral Maxillofac Surg., 2014, 52(9): 838-844.
15	WOJKE N, BEWLEY A, PAULUS D. Simple online and realtime tracking with a deep association metric[C]. IEEE International Conference on Image Processing (ICIP). IEEE, 2017: 3645-3649.
16	GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: the KITTI dataset[J]. International Journal of Robotics Research (IJRR), 2013, 32(11): 1231-1237.
17	STURM J, ENGELHARD N, ENDRES F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]. IEEE International Conference on Intelligent Robots and Systems (IROS). IEEE, Oct. 2012.
18	ZHANG Z, SCARAMUZZA D. A tutorial on quantitative trajectory evaluation for visual (-inertial) odometry[C]. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018: 7244-7251.
19	LI Jiwen. Research on the visual pose estimation method of intelligent vehicles in urban environment[D]. Guangzhou: South China University of Technology, 2023.

Operation	Mean time/ms	Share/%
ORB feature extraction	19.6	26.9
Camera and tracked-object pose estimation	11.6	16
Pose optimization	26.2	36
Keyframe generation	15.4	21.1
Total	72.8	100
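As a quick check on the real-time claim, the per-stage times in the timing table above can be converted into a frame rate. This sketch assumes the four stages run sequentially on every frame (which the table's total suggests but this page does not state) and compares the total against a 100 ms budget, an assumption based on the 10 Hz capture rate of the KITTI sequences.

```python
# Per-stage mean times in milliseconds, copied from the timing table.
stage_ms = {
    "ORB feature extraction": 19.6,
    "Camera and tracked-object pose estimation": 11.6,
    "Pose optimization": 26.2,
    "Keyframe generation": 15.4,
}

total_ms = sum(stage_ms.values())          # should reproduce the table's total
frames_per_second = 1000.0 / total_ms      # throughput if stages run back to back
shares = {name: 100.0 * ms / total_ms for name, ms in stage_ms.items()}

# Assumption: a 10 Hz input stream leaves 100 ms per frame.
frame_budget_ms = 100.0
meets_budget = total_ms < frame_budget_ms

for name, ms in stage_ms.items():
    print(f"{name}: {ms:.1f} ms ({shares[name]:.1f}%)")
print(f"total: {total_ms:.1f} ms, about {frames_per_second:.1f} frames per second")
```

Under these assumptions the pipeline runs at roughly 13.7 frames per second, with pose optimization as the largest single cost.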

RPE_t: relative translation error, m/frame. RPE_R: relative rotation error, °/frame. A dash marks a cell with no value given.

	DynaSLAM II		VDO-SLAM (RGB-D)				CubeSLAM (RGB-D)				Proposed method			
	Camera pose		Camera pose		Tracked-object pose		Camera pose		Tracked-object pose		Camera pose		Tracked-object pose	
Sequence	RPE_t	RPE_R	RPE_t	RPE_R	RPE_t	RPE_R	RPE_t	RPE_R	RPE_t	RPE_R	RPE_t	RPE_R	RPE_t	RPE_R
00	0.04	0.06	0.06	0.07	0.11	1.05	-	-	-	-	0.04	0.06	-	-
01	0.05	0.04	0.12	0.04	0.16	0.91	-	-	-	-	0.06	0.05	0.30	1.80
02	0.04	0.02	0.04	0.02	0.28	1.24	-	-	-	-	0.04	0.02	0.25	1.51
03	0.06	0.04	0.08	0.03	0.10	0.30	0.10	0.05	4.60	3.61	0.08	0.06	0.27	0.87
04	0.07	0.06	0.11	0.05	0.19	0.83	0.12	0.07	32.5	5.60	0.07	0.06	-	-
05	0.06	0.03	0.10	0.02	0.11	0.37	0.07	0.02	6.49	3.26	0.06	0.03	0.23	0.63
06	0.02	0.04	0.02	0.05	0.12	1.08	-	-	-	-	0.02	0.04	0.20	1.25
18	0.05	0.02	0.07	0.02	0.08	0.25	0.05	0.04	3.80	3.19	0.05	0.03	-	-
19	0.05	0.03	-	-	-	-	-	-	-	-	0.05	0.03	-	-
20	0.07	0.04	0.16	0.03	0.08	0.37	0.19	0.13	5.70	3.42	0.05	0.05	-	-
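The comparison table reports per-frame relative pose errors. The sketch below shows one common way such errors are computed from ground-truth and estimated trajectories, following the usual definition in the trajectory-evaluation literature. It is an assumption that the paper uses this exact form; the helper names are hypothetical, and how the authors aggregate the per-frame values (mean, RMSE, or otherwise) is not stated on this page.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def relative_pose_errors(gt, est):
    """Per-frame (translation error in m, rotation error in degrees)."""
    errors = []
    for i in range(len(gt) - 1):
        gt_step = mat_mul(rigid_inverse(gt[i]), gt[i + 1])     # true motion
        est_step = mat_mul(rigid_inverse(est[i]), est[i + 1])  # estimated motion
        err = mat_mul(rigid_inverse(gt_step), est_step)        # residual motion
        trans = math.sqrt(sum(err[k][3] ** 2 for k in range(3)))
        trace = err[0][0] + err[1][1] + err[2][2]
        cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))   # clamp for acos
        errors.append((trans, math.degrees(math.acos(cos_angle))))
    return errors


def planar_pose(x, y, yaw_deg):
    """Hypothetical helper: pose at (x, y, 0) rotated about the z axis."""
    c, s = math.cos(math.radians(yaw_deg)), math.sin(math.radians(yaw_deg))
    return [[c, -s, 0.0, x], [s, c, 0.0, y],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


ground_truth = [planar_pose(0, 0, 0), planar_pose(1, 0, 0), planar_pose(2, 0, 0)]
estimate = [planar_pose(0, 0, 0), planar_pose(1.1, 0, 0), planar_pose(2.1, 0, 1.0)]
errors = relative_pose_errors(ground_truth, estimate)
```

In this toy trajectory the first step carries a 0.1 m translation error and no rotation error, and the second step carries a 1° rotation error and almost no translation error, which illustrates why the metric is insensitive to drift accumulated in earlier frames.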

