Domain Adaptive Visual Object Detection for Autonomous Driving Based on Multi-granularity Relation Reasoning

doi:10.19562/j.chinasae.qcgc.2025.02.001

Abstract

Most existing domain adaptive visual object detection algorithms are built on two-stage detectors and do not exploit the semantic topological relationships among elements in image space, which leads to suboptimal cross-domain adaptation. This paper therefore proposes a domain adaptive visual object detection algorithm based on multi-granularity relation reasoning. First, a coarse-grained patch relation reasoning module uses a coarse-grained patch graph to capture the topological relationship between foreground and background and performs cross-domain adaptation on the foreground regions. Next, a fine-grained semantic relation reasoning module reasons over a fine-grained semantic graph to strengthen cross-domain semantic dependencies among multiple categories. Finally, a granularity-induced feature alignment module adjusts the feature alignment weights according to node affinity, which improves the detector's adaptability to overall scene changes. Experiments on multiple cross-domain autonomous driving scenarios verify the robustness and real-time performance of the proposed algorithm.

Key words: autonomous driving, visual object detection, domain adaptation, graph reasoning

Jinhui Suo, Xiaowei Wang, Peiwen Jiang, Chi Ding, Ming Gao, Yougang Bian. Domain Adaptive Visual Object Detection for Autonomous Driving Based on Multi-granularity Relation Reasoning [J]. Automotive Engineering, 2025, 47(2): 201-210.
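
The abstract does not give the equations of the proposed modules, so the following is only a generic illustration of graph reasoning over patch nodes, not the authors' coarse-grained patch relation reasoning module. It applies one graph-convolution step in the style of reference [18] to a patch graph whose edge weights are cosine affinities. The helper names `cosine_affinity` and `gcn_layer`, the feature sizes, and the random inputs are all assumptions made for this sketch.

```python
import numpy as np


def cosine_affinity(x):
    """Pairwise cosine similarity of row vectors, clipped to [0, 1], zero diagonal."""
    unit = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
    aff = np.clip(unit @ unit.T, 0.0, 1.0)
    np.fill_diagonal(aff, 0.0)  # self-loops are added inside gcn_layer
    return aff


def gcn_layer(x, adj, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1, so this is safe
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)


# Hypothetical inputs: 6 patch nodes with 4-dimensional features.
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(6, 4))
weight = rng.normal(size=(4, 3))  # stands in for a learned projection

adj = cosine_affinity(patch_feats)
refined = gcn_layer(patch_feats, adj, weight)
print(refined.shape)  # (6, 3)
```

In a sketch like this, patches with similar features exchange more information than dissimilar ones, which is one plausible way an affinity-based graph could separate foreground from background context.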

References

1	FAHRENKROG F, REITHINGER S, GÜLSEN B, et al. European research project’s contributions to a safer automated road traffic [J]. Automotive Innovation, 2023, 6(4): 521-530.
2	HE X, LV C. Towards safe autonomous driving: decision making with observation-robust reinforcement learning [J]. Automotive Innovation, 2023, 6(4): 509-520.
3	ZHAO D Y, ZHAO S E. Autonomous driving 3D object detection based on cascade YOLOv7 [J]. Automotive Engineering, 2023, 45(7): 1112-1122. (in Chinese)
4	ZHANG B L, QIN H R. A method of vehicle detection at night based on RetinaNet and optimized loss functions [J]. Automotive Engineering, 2021, 43(8): 1195-1202. (in Chinese)
5	CHEN Y, LI W, SAKARIDIS C, et al. Domain adaptive faster R-CNN for object detection in the wild [C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 3339-3348.
6	SAITO K, USHIKU Y, HARADA T, et al. Strong-weak distribution alignment for adaptive object detection [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 6956-6965.
7	HU J, XU B Y, XIONG Z Q, et al. Cross-domain object detection algorithm based on multi-scale mask classification domain adaptive network [J]. Automotive Engineering, 2022, 44(9): 1327-1338. (in Chinese)
8	ZHENG Y, HUANG D, LIU S, et al. Cross-domain object detection through coarse-to-fine feature adaptation [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 13766-13775.
9	LI W, LIU X, YUAN Y. SIGMA: semantic-complete graph matching for domain adaptive object detection [C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 5281-5290.
10	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. Advances in Neural Information Processing Systems, 2015, 28.
11	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector [C]. ECCV 2016: 21-37.
12	TIAN Z, SHEN C, CHEN H, et al. FCOS: fully convolutional one-stage object detection [C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 9627-9636.
13	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
14	REDMON J, FARHADI A. Yolov3: an incremental improvement [J]. arXiv preprint, 2018.
15	WANG C Y, YEH I H, LIAO H Y M. Yolov9: learning what you want to learn using programmable gradient information [J]. arXiv preprint, 2024.
16	REDMON J, FARHADI A. Yolo9000: better, faster, stronger [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7263-7271.
17	CHEN C, ZHENG Z, DING X, et al. Harmonizing transferability and discriminability for adapting object detectors [C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 8866-8875.
18	KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks [J]. arXiv, 2017.
19	MAATEN L, HINTON G E. Visualizing data using t-SNE [J]. Journal of Machine Learning Research, 2008, 9(11).
20	CORDTS M, OMRAN M, RAMOS S, et al. The cityscapes dataset for semantic urban scene understanding [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 3213-3223.
21	SAKARIDIS C, DAI D, VAN GOOL L. Semantic foggy scene understanding with synthetic data [J]. International Journal of Computer Vision, 2018, 126(9): 973-992.
22	GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: the KITTI dataset [J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237.
23	YU F, CHEN H, WANG X, et al. BDD100K: a diverse driving dataset for heterogeneous multitask learning [C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 2636-2645.
24	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [C]. ECCV 2014: 740-755.
25	VS V, GUPTA V, OZA P, et al. MeGA-CDA: memory guided attention for category-aware unsupervised domain adaptive object detection [C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 4514-4524.
26	LIU Z F, WU Y, LIU P G, et al. Cross-domain object detection for intelligent driving based on joint distribution matching of features and labels [J]. Automotive Engineering, 2023, 45(11): 2082-2103. (in Chinese)
27	HSU C C, TSAI Y H, LIN Y Y, et al. Every pixel matters: center-aware feature alignment for domain adaptive object detector [C]. ECCV 2020: 733-748.
28	TIAN K, ZHANG C, WANG Y, et al. Knowledge mining and transferring for domain adaptive object detection [C]. IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 9113-9122.
29	ZHANG S, TUO H, HU J, et al. Domain adaptive YOLO for one-stage cross-domain detection [C]. Proceedings of The 13th Asian Conference on Machine Learning, 2021: 785-797.
30	LI G, JI Z, QU X, et al. Cross-domain object detection for autonomous driving: a stepwise domain adaptative YOLO approach [J]. IEEE Transactions on Intelligent Vehicles, 2022, 7(3): 603-615.
31	MATTOLIN G, ZANELLA L, RICCI E, et al. Confmix: unsupervised domain adaptation for object detection via confidence-based mixing [C]. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023: 423-433.
32	LI W, LIU X, YUAN Y. SIGMA++: improved semantic-complete graph matching for domain adaptive object detection [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(7): 9022-9040.
33	XIE R, YU F, WANG J, et al. Multi-level domain adaptive learning for cross-domain detection [C]. IEEE/CVF International Conference on Computer Vision (ICCV), 2019: 3213-3219.
34	YANG X, WAN S, JIN P. Domain-invariant region proposal network for cross-domain detection [C]. IEEE International Conference on Multimedia and Expo (ICME), 2020: 1-6.
35	WANG X, JIANG P, LI Y, et al. Progressive critical region transfer for cross-domain visual object detection [J]. IEEE Transactions on Intelligent Transportation Systems, 2024: 1-15.
36	CAI M, LUO M, ZHONG X, et al. Uncertainty-aware model adaptation for unsupervised cross-domain object detection [J]. arXiv preprint, 2021.
37	KHINDKAR V, ARORA C, BALASUBRAMANIAN V N, et al. To miss-attend is to misalign! residual self-attentive feature alignment for adapting object detectors [C]. IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022: 376-386.
38	HE M, WANG Y, WU J, et al. Cross domain object detection by target-perceived dual branch distillation [C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 9560-9570.
39	LI G, JI Z, QU X. Stepwise domain adaptation (SDA) for object detection in autonomous vehicles using an adaptive centernet [J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(10): 17729-17743.
40	XU M, WANG H, NI B, et al. Cross-domain detection via graph-induced prototype alignment [C]. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 12352-12361.

Method	Detector	Person	Rider	Car	Truck	Bus	Train	Motorcycle	Bicycle	mAP
Baseline	YOLOv5	36.9	38.4	49.0	20.6	30.1	5.2	14.5	28.7	27.9
C2F [8]	Faster R-CNN	34.0	46.9	52.1	30.8	43.2	29.9	34.7	37.4	38.6
MeGA [25]	Faster R-CNN	37.7	49.0	52.4	25.4	49.2	46.9	34.5	39.0	41.8
MMCN [7]	Faster R-CNN	33.4	46.8	51.9	29.1	48.4	43.2	36.0	37.4	40.8
FLDMN [26]	Faster R-CNN	33.4	45.4	50.9	29.9	55.4	38.3	33.4	36.5	40.4
EPM [27]	FCOS	41.5	43.6	57.1	29.4	44.9	39.7	29.0	36.1	40.2
KTNet [28]	FCOS	43.0	42.7	60.0	32.3	46.6	38.4	31.2	38.2	41.5
SIGMA [9]	FCOS	46.9	48.4	63.7	27.1	50.7	35.9	34.7	41.4	43.5
DA-YOLO [29]	YOLOv3	29.5	27.7	46.1	9.1	28.2	4.5	12.7	24.8	36.1
S-DAYOLO [30]	YOLOv5	42.6	42.1	61.9	23.5	40.5	39.5	24.4	37.3	39.0
ConfMix [31]	YOLOv5	45.0	43.4	62.6	27.3	45.8	40.0	28.6	33.5	40.8
MGR² (ours)	YOLOv5	44.1	47.8	62.4	28.1	51.8	54.0	29.7	41.2	44.9
Oracle	YOLOv5	46.4	49.4	67.5	29.8	55.1	52.2	35.5	40.9	47.1
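
As a quick consistency check on the table above, the sketch below recomputes mAP for three rows, assuming mAP here is the unweighted mean of the eight per-class AP values. The dictionary keys and the helper name `mean_ap` are illustrative, and the numbers are copied from the Baseline, MGR² and Oracle rows.

```python
# Per-class AP (%) in column order:
# person, rider, car, truck, bus, train, motorcycle, bicycle.
per_class_ap = {
    "Baseline": [36.9, 38.4, 49.0, 20.6, 30.1, 5.2, 14.5, 28.7],
    "MGR2": [44.1, 47.8, 62.4, 28.1, 51.8, 54.0, 29.7, 41.2],
    "Oracle": [46.4, 49.4, 67.5, 29.8, 55.1, 52.2, 35.5, 40.9],
}
# mAP (%) as reported in the last column of the table.
reported_map = {"Baseline": 27.9, "MGR2": 44.9, "Oracle": 47.1}


def mean_ap(aps):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return sum(aps) / len(aps)


for name, aps in per_class_ap.items():
    recomputed = mean_ap(aps)
    # Agreement within rounding error of one decimal place.
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported_map[name]}")
```

All three rows agree with the reported mAP to within rounding. The DA-YOLO row does not reproduce this way (its per-class values average to about 22.8), so its reported mAP presumably reflects a different evaluation setting in the cited work.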

Method	Detector	Person	Rider	Car	Truck	Train	mAP
Baseline	YOLOv5	55.5	15.3	80.3	26.1	21.4	39.7
MLDA [33]	Faster R-CNN	53.0	24.5	72.2	28.7	25.3	40.7
C2F [8]	Faster R-CNN	50.4	29.7	73.6	29.7	21.6	41.0
DI-FR [34]	Faster R-CNN	58.5	37.2	75.4	30.6	18.5	44.0
PCRT [35]	Faster R-CNN	58.8	19.4	80.1	29.9	39.6	45.6
MGR² (ours)	YOLOv5	56.2	16.5	82.6	48.3	32.7	47.3
Oracle	YOLOv5	84.4	88.0	96.0	87.6	80.4	87.3

Method	Detector	Person	Rider	Car	Truck	Bus	Motorcycle	Bicycle	mAP
Baseline	YOLOv5	37.4	24.6	58.9	19.1	20.0	16.3	21.2	28.2
PCRT [35]	Faster R-CNN	39.1	30.4	55.9	15.3	17.5	21.8	30.1	30.0
UAMA [36]	Faster R-CNN	37.3	32.9	55.8	19.0	15.4	17.6	27.0	29.3
ILLUME [37]	Faster R-CNN	33.2	20.5	47.8	20.8	33.8	24.4	26.7	29.6
TDD [38]	Faster R-CNN	39.6	38.9	53.9	24.1	25.5	24.5	28.8	33.6
SIGMA++ [32]	FCOS	47.5	30.4	65.6	21.1	26.3	17.8	27.1	33.7
S-DAYOLO [30]	YOLOv5	48.4	29.1	64.5	29.5	28.6	14.4	20.5	33.6
MGR² (ours)	YOLOv5	45.2	34.7	65.0	25.2	29.7	21.1	31.0	36.0
Oracle	YOLOv5	52.8	38.0	73.2	50.4	48.3	32.9	37.0	47.5

Method	Detector	Person	Rider	Car	Truck	Bus	Motorcycle	Bicycle	mAP
Baseline	YOLOv5	40.4	20.2	60.7	31.4	36.6	10.2	27.5	32.4
SDA [39]	CenterNet	42.8	26.4	53.9	33.5	36.5	20.4	28.2	34.5
S-DAYOLO [30]	YOLOv5	44.8	25.1	63.9	39.4	42.6	27.5	32.5	39.4
MGR² (ours)	YOLOv5	45.8	31.0	67.7	49.9	48.7	29.7	40.0	44.7
Oracle	YOLOv5	49.6	32.3	73.6	52.8	52.3	38.5	40.1	48.5

Method	mAP/%	Net gain/%
Baseline	27.9
w/o CGPR²	42.8	14.9
w/o FGSR²	43.0	15.1
w/o GIFA	41.1	13.2
Full model	44.9	17.0
Oracle	47.1
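
The net-gain column in the ablation table can be reproduced by subtracting the baseline mAP from each variant. The sketch below does that and also derives two quantities the table does not list: the drop caused by removing each module, and the share of the baseline-to-oracle gap that the full model closes. Variable names are illustrative, and these derived figures are simple arithmetic on the table, not results reported by the authors.

```python
baseline_map = 27.9
oracle_map = 47.1
# mAP (%) of each variant, copied from the ablation table.
variant_map = {
    "w/o CGPR2": 42.8,
    "w/o FGSR2": 43.0,
    "w/o GIFA": 41.1,
    "Full model": 44.9,
}

# Net gain over the baseline, in percentage points (matches the table column).
net_gain = {name: round(m - baseline_map, 1) for name, m in variant_map.items()}

# Drop relative to the full model when one module is removed.
full_map = variant_map["Full model"]
drop_when_removed = {
    name: round(full_map - m, 1)
    for name, m in variant_map.items()
    if name != "Full model"
}

# Fraction of the baseline-to-oracle gap closed by the full model.
gap_closed = (full_map - baseline_map) / (oracle_map - baseline_map)

print(net_gain)
print(drop_when_removed)
print(f"gap closed: {gap_closed:.1%}")
```

Under this reading, removing GIFA costs the most mAP of the three modules, which is consistent with the table listing its variant as the weakest.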

Method	Detector	FPS	mAP/%
GPA [40]	Faster R-CNN	22.98	39.5
SIGMA [9]	FCOS	78.25	44.2
Baseline	YOLOv5	106.27	27.9
MGR² (ours)	YOLOv5	106.61	44.9
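
To make the speed comparison above easier to read, the sketch below converts each FPS value into per-frame latency and into a speed-up relative to the proposed method. It assumes FPS is the reciprocal of the average per-frame inference time; the variable names are illustrative and the numbers are copied from the table.

```python
# Frames per second, copied from the table above.
fps = {
    "GPA": 22.98,
    "SIGMA": 78.25,
    "Baseline": 106.27,
    "MGR2": 106.61,
}

# Per-frame latency in milliseconds, assuming latency = 1000 / FPS.
latency_ms = {name: 1000.0 / value for name, value in fps.items()}

# How many times faster MGR2 runs than each other method.
speedup_of_mgr2 = {
    name: fps["MGR2"] / value for name, value in fps.items() if name != "MGR2"
}

for name in fps:
    print(f"{name}: {latency_ms[name]:.2f} ms/frame")
for name, ratio in speedup_of_mgr2.items():
    print(f"MGR2 vs {name}: {ratio:.2f}x")
```

The near-identical latency of the Baseline and MGR² rows suggests that the adaptation modules add essentially no inference cost, which fits the real-time claim in the abstract.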
