Automotive Engineering ›› 2022, Vol. 44 ›› Issue (10): 1503-1510. DOI: 10.19562/j.chinasae.qcgc.2022.10.004

Special Topic: Intelligent and Connected Vehicle Technology - Perception & HMI & Evaluation 2022

Vehicle Visual SLAM in Dynamic Scenes Based on Semantic Segmentation and Motion Consistency Constraints

Shengjie Huang1, Manjiang Hu1,2, Yunshui Zhou1,2, Zhouping Yin1, Xiaohui Qin1,2, Yougang Bian1,2, Qianqian Jia3

  1. State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082
    2. Wuxi Intelligent Control Research Institute of Hunan University, Wuxi 214115
    3. China Society of Automotive Engineers, Beijing 100000
  • Online: 2022-10-25  Published: 2022-10-21
  • Contact: Xiaohui Qin  E-mail: qxh880507@163.com
  • Supported by: National Key R&D Program of China (2021YFB2501800); National Natural Science Foundation of China (52172384); Natural Science Foundation of Changsha (KQ2202162); Independent Research Project of the State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body (61775006)

Abstract:

Traditional simultaneous localization and mapping (SLAM) methods for vehicles generally rely on the assumption of a static environment, so pose estimation accuracy may degrade and the front-end visual odometry may even fail to track in dynamic scenes. This paper proposes a visual SLAM method for dynamic scenes that combines the Fast-SCNN real-time semantic segmentation network with motion consistency constraints. First, Fast-SCNN is used to obtain segmentation masks of potentially dynamic objects and to remove the feature points on them, yielding a preliminary estimate of the camera pose. Then, based on motion constraints and a chi-square test, static points on the potentially dynamic objects are re-added to further optimize the camera pose. Tests on the validation set show that the mean pixel accuracy and mean intersection over union (mIoU) of the trained semantic segmentation network both exceed 90%, with a per-frame processing time of about 14.5 ms, meeting the segmentation accuracy and real-time requirements of the SLAM system. Tests on the public dataset of the Technical University of Munich (TUM) and on a real-vehicle dataset show that ORB-SLAM3 integrated with the proposed algorithm achieves an average improvement of more than 80% on some metrics, significantly enhancing the accuracy and robustness of SLAM in dynamic scenes and helping to ensure the safety of intelligent vehicles.
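To make the two-stage idea in the abstract concrete, the following Python sketch illustrates (not the authors' implementation) how feature points falling inside a semantic mask of potentially dynamic objects can first be set aside, and how points that are nonetheless consistent with the estimated camera motion can be re-added via a chi-square test on the reprojection residual. The function names, the pixel-noise sigma, and the 5.991 threshold (95% confidence, 2 degrees of freedom) are illustrative assumptions, not values taken from the paper.

    import numpy as np
    import cv2

    # Illustrative sketch only: split keypoints by a binary mask of potentially
    # dynamic classes (e.g. from a Fast-SCNN-style segmentation network), then
    # re-admit masked points whose reprojection error under the current camera
    # pose estimate passes a chi-square consistency test.

    CHI2_THRESH_2DOF_95 = 5.991  # assumed 95% chi-square threshold, 2-DoF residual

    def split_by_mask(keypoints, dynamic_mask):
        """Separate cv2.KeyPoint objects into static / potentially dynamic sets."""
        static, dynamic = [], []
        for kp in keypoints:
            u, v = int(round(kp.pt[0])), int(round(kp.pt[1]))
            (dynamic if dynamic_mask[v, u] else static).append(kp)
        return static, dynamic

    def readd_static_points(points_3d, points_2d, K, rvec, tvec, sigma_px=1.0):
        """Return a boolean mask of masked-out points that are actually static,
        i.e. whose reprojection residual is consistent with the camera motion."""
        proj, _ = cv2.projectPoints(points_3d, rvec, tvec, K, None)
        residual = points_2d - proj.reshape(-1, 2)
        chi2 = np.sum((residual / sigma_px) ** 2, axis=1)
        return chi2 < CHI2_THRESH_2DOF_95

In this reading of the abstract, points kept by readd_static_points would rejoin the pose optimization, so static structure on parked vehicles or standing pedestrians is not discarded wholesale.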

Key words: intelligent vehicle, simultaneous localization and mapping, semantic segmentation, dynamic scenes, motion consistency