Automotive Engineering ›› 2023, Vol. 45 ›› Issue (12): 2280-2290. DOI: 10.19562/j.chinasae.qcgc.2023.12.010

Special Topic: Intelligent and Connected Vehicle Technology - Perception & HMI & Evaluation (2023)



Research on Visible Light and Infrared Post-Fusion Detection Based on TC-YOLOv7 Algorithm

Linhui Li1,2, Xinliang Zhang1, Yifan Fu1, Jing Lian1,2, Jiaxu Ma1

  1. School of Automotive Engineering, Dalian University of Technology, Dalian 116024
    2. State Key Laboratory of Structural Analysis for Industrial Equipment, Dalian University of Technology, Dalian 116024
  • Received: 2023-04-22  Revised: 2023-05-25  Online: 2023-12-25  Published: 2023-12-21
  • Contact: Jing Lian  E-mail: lianjingdlut@126.com
  • Funding: National Natural Science Foundation of China (61976039); Dalian Science and Technology Innovation Fund (2021JJ12GX015); Fundamental Research Funds for the Central Universities (DUT22JC09)


Abstract:

To address the difficulty of achieving fast and accurate detection of visual targets in complex autonomous driving scenes, a TC-YOLOv7 detection algorithm based on attention mechanisms is proposed and applied to visible-light, infrared, and post-fusion detection. First, the YOLOv7 baseline detection model is improved with CBAM and Transformer attention modules, and its visible-light and infrared detection performance is verified on multi-scene datasets. Second, three non-maximum-suppression post-fusion methods, SS-PostFusion, DS-PostFusion, and DD-PostFusion, are constructed and their detection performance is verified. Finally, the combination of TC-YOLOv7 and DD-PostFusion is compared against single-sensor detection results. The results show that TC-YOLOv7 improves mAP@0.5 by more than 3% over the YOLOv7 baseline in sunny, night, fog, rain, and snow scenes for both visible-light and infrared images. On the comprehensive-scene test set, the TC-YOLOv7 post-fusion method improves detection accuracy by 4.5% over visible-light detection, by 11.1% over infrared detection, and by 0.6% over the YOLOv7 post-fusion method, while its inference speed of 39 fps meets the real-time requirement of autonomous driving.
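The abstract states that the YOLOv7 baseline is improved with CBAM and Transformer attention modules, but the exact integration is not given here. As a rough illustration of the kind of module involved, the following is a standard CBAM block in PyTorch; the module names, the reduction ratio, and the 7×7 spatial kernel follow the original CBAM design, not necessarily this article's variant.

```python
# Illustrative sketch only: a standard CBAM block (channel + spatial attention),
# the kind of module inserted into a YOLOv7 backbone/neck. Not the authors' code.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)     # channel-wise max map
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Channel attention followed by spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```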
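The three post-fusion variants (SS-, DS-, DD-PostFusion) are only named in the abstract, and their definitions are not given here. Below is a minimal sketch of the general decision-level idea, assuming both detectors output boxes in a shared, registered image frame and that fusion reduces to class-aware NMS over the pooled detections; the function name and threshold are illustrative, not the paper's specification.

```python
# Minimal sketch of decision-level post-fusion, assuming each detector returns
# detections as rows of (x1, y1, x2, y2, score, class_id) in a common frame
# after visible/infrared registration. Hypothetical helper, not the paper's method.
import torch
from torchvision.ops import batched_nms

def post_fusion_nms(dets_rgb: torch.Tensor,
                    dets_ir: torch.Tensor,
                    iou_thresh: float = 0.5) -> torch.Tensor:
    """Pool detections from both modalities, then suppress duplicates with
    class-aware NMS so each object is kept once at its highest score."""
    dets = torch.cat([dets_rgb, dets_ir], dim=0)
    boxes, scores, classes = dets[:, :4], dets[:, 4], dets[:, 5].long()
    keep = batched_nms(boxes, scores, classes, iou_thresh)
    return dets[keep]
```

A scheme like this keeps whichever modality scores an object higher, which is presumably why the fused result outperforms either sensor alone in degraded scenes; how the paper's DD variant treats duplicate versus modality-unique detections differently is not reproduced here.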

Key words: deep learning, sensor fusion, YOLO, attention mechanism, non-maximum suppression